The origin of the novel SARS-CoV-2: Is it from natural selection or genetic engineering in the laboratory?


The outbreak of the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) pandemic in December 2019 has been devastating to mankind. It is undoubtedly a landmark in human history, and will surely be remembered just like the ‘Spanish flu’ of the 20th century.

Coronaviruses are a family of RNA viruses that cause mild to very severe respiratory illnesses. The first major coronavirus-related outbreak, which emerged in 2003, was the Severe Acute Respiratory Syndrome (SARS) epidemic in China. A second outbreak, Middle East Respiratory Syndrome (MERS), emerged in 2012 in Saudi Arabia.

When something as catastrophic as a pandemic breaks out, it is expected that people will wonder about the ‘why’ and the ‘how’. Even though there may be perfectly sound scientific explanations for these pandemics, that never stops those who look to the supernatural for the ‘why’, or the impetuous conspiracy theorists for the ‘how’.

Social media is currently inundated with rumours and conspiracy theories surrounding the origin of SARS-CoV-2. While credible information can help the populace understand the origin and mode of transmission of the virus, misinformation, rumours and conspiracy theories can easily erode the gains made so far in tackling the menace.

Indeed, these unfounded stories could easily threaten the open and transparent sharing of reliable scientific data on the coronavirus outbreak. It stands to reason that besides social distancing and strict observance of the preventive protocols in these abnormal times, information that is credible, authentic, verifiable and reliable is the next essential ingredient needed to fight the menace.

Many social media posts have singled out the Wuhan Institute of Virology (WIV) for intense scrutiny. This is probably because bat coronaviruses (the coronaviruses closest to SARS-CoV-2) are a core research theme at the institute, and because WIV is the only research institute in China with a Biosafety Level 4 (BSL-4) laboratory, the highest biosafety level.

Consequently, some of the speculations and conspiracy theories on social media include the possibility that the virus was genetically engineered in the laboratory at WIV for biological warfare.

As a medical scientist, I would like to address the issue surrounding the source of this novel coronavirus from the angle of viral genetics and phylodynamics, and leave the other engineering-related conspiracies – e.g. 5G networks – to the experts.

Viruses consist of protein envelopes encapsulating either deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), which is the primary infective material. It is therefore instructive to answer the question of a virus’s origin at the genomic level, by first examining empirical scientific data from two fields. Viral genomics is the study of viral genetics and the interactions of viruses with infected hosts. Phylodynamic analysis is the analysis of natural selection, genetic diversity and population dynamics of infectious disease pathogens, as well as their intra-host behaviour during pandemics. This will not only help us to understand the genetic origin and trajectory of the novel coronavirus, but also to debunk (or not) the conspiracy theories about the virus’s origin.

The SARS-CoV-2 virus emerged in the city of Wuhan, China, in December 2019 and has since caused a pandemic of COVID-19 (the clinical disease manifestation) affecting more than 100 other countries. It is seen as the product of natural evolution and not genetic engineering, according to findings published this year in the journal Nature Medicine (March, 2020).

According to the team of researchers from Tulane University, University of Sydney, University of Edinburgh and Columbia University, the analyses of genome sequence data from SARS-CoV-2 and other related viruses showed no evidence that the novel virus is man-made or otherwise genetically engineered.

First, we need to appreciate the phenomenon of natural selection and then juxtapose that with artificial or genetic manipulation by man. Natural selection is a mechanism that was first proposed by the British scientist Charles Darwin in the 1850s.

He postulated: “Due to limited resources in nature, already existing organisms with heritable traits that favour survival and reproduction will reproduce more than their peers, causing the traits to increase in frequency over generations”. Genetic engineering – also called genetic manipulation or genetic modification – is, on the other hand, the “deliberate modification of an organism’s genes in an attempt to alter its characteristics”. In simple terms, the former is natural while the latter is artificial.
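To make the phrase “increase in frequency over generations” concrete, here is a minimal sketch of a textbook one-trait selection model. It is only an illustration, not an analysis from the Nature Medicine study: the function names, the starting frequency of 1 percent and the 10 percent reproductive advantage are all assumptions chosen for the example.

```python
# Toy model of natural selection acting on one heritable trait.
# All numbers below are illustrative assumptions, not measured values.

def next_frequency(p, fitness_trait, fitness_other):
    """Frequency of the trait after one generation of selection.

    p is the current fraction of the population carrying the trait;
    the two fitness values are relative reproductive success.
    """
    mean_fitness = p * fitness_trait + (1 - p) * fitness_other
    return p * fitness_trait / mean_fitness


def simulate(p0, fitness_trait, fitness_other, generations):
    """Return the trait frequency at generations 0..generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], fitness_trait, fitness_other))
    return freqs


# Assumed scenario: trait starts at 1 percent with a 10 percent advantage.
trajectory = simulate(p0=0.01, fitness_trait=1.10, fitness_other=1.00,
                      generations=100)
for gen in (0, 25, 50, 75, 100):
    print(f"generation {gen:3d}: trait frequency = {trajectory[gen]:.3f}")
```

Under these assumed numbers, a trait carried by one organism in a hundred comes to dominate the population within about a hundred generations, with no deliberate intervention by anyone.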

The Chinese health authorities alerted the World Health Organization (WHO) on December 31, 2019, of a possible outbreak of a new strain of coronavirus that was causing very severe illnesses among the population in Wuhan.

As of April 8, 2020, nearly 1,500,000 COVID-19 cases had been reported (excluding the many asymptomatic and mild cases that have likely gone undiagnosed), and the disease had killed over 83,000 people worldwide. Prompted by the severity of the disease, Chinese scientists were the first to sequence the entire genome of SARS-CoV-2 shortly after the outbreak began, and they made the data publicly available to scientists worldwide.
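The two figures above give a crude case-fatality ratio, and the caveat about undiagnosed cases matters for how that ratio is read. The short sketch below does the arithmetic. The function name and the undercount multipliers are hypothetical and chosen only for illustration; they are not estimates from any study.

```python
# Crude case-fatality arithmetic using the figures quoted in the article.

def crude_case_fatality_ratio(deaths, confirmed_cases):
    """Deaths divided by confirmed cases, as a fraction between 0 and 1."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return deaths / confirmed_cases


deaths = 83_000        # "over 83,000" as of April 8, 2020
confirmed = 1_500_000  # "nearly 1,500,000" as of April 8, 2020

cfr = crude_case_fatality_ratio(deaths, confirmed)
print(f"crude case-fatality ratio: {cfr:.1%}")

# Hypothetical multipliers: if the true number of infections were k times
# the confirmed count, the fatality ratio among all infections would shrink.
for k in (1, 2, 5):
    ratio = crude_case_fatality_ratio(deaths, confirmed * k)
    print(f"true infections = {k} x confirmed -> {ratio:.1%}")
```

With the reported figures the crude ratio comes to about 5.5 percent. Every undiagnosed mild case lowers the true figure, which is why a crude ratio taken in the middle of an outbreak should be read with caution.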

The genomic sequence data of the virus revealed two things: (1) that the number of infections had been rising steadily due to human-to-human transmission after a single introduction into the human population; and (2) that the Chinese health authorities immediately detected a possible pandemic.

The genomic sequence data has since been used to unravel the origin(s) of SARS-CoV-2 by exploring the unique features of the virus. Scientists have consequently analysed the genetic template for two main features of the novel coronavirus’s spike protein:

  1. The receptor-binding domain (RBD) of the spike proteins that exist on the outside of the virus, used for hooking and penetrating human and animal cells.
  2. The cleavage site (CS), a protein sequence that enables the virus to crack open the host cells before penetration.

What is the phylodynamic evidence?

The team of scientists discovered that the RBD portion of the SARS-CoV-2 spike protein had evolved to effectively target a molecular structure called ACE2, a receptor on the surface of human cells. When they used computer simulations built with SARS-CoV as a template, the simulations suggested other possible nucleotides in the region where SARS-CoV-2 had undergone a mutation.

When the nucleotides actually used by SARS-CoV-2 were inserted into the computer simulations, the resulting models were predicted to bind human cells poorly, even though the real SARS-CoV-2 binds them very well! What’s the deduction? If the virus had been genetically engineered in the laboratory, computer simulation would have told the evil scientist that the new coronavirus would not work the way it was constructed.

From the analyses, the SARS-CoV-2 spike protein was so effective at binding ACE2 on human cells (10 times more effective) that the scientists concluded it was the result of natural selection, which excludes the possibility of genetic engineering in the laboratory. Remember that, generally, genetically engineered proteins bind the wild-type receptors with much less affinity and avidity than their natural counterparts.

This evidence for natural evolution was further supported by the available genomic data on the overall molecular structure of SARS-CoV-2’s backbone. As a general rule in genetic engineering, if a researcher sought to engineer a novel coronavirus as a biological weapon or an extremely infectious pathogen, the new construct would be based on the backbone of a virus that is already known to cause illness.

The scientists however discovered that the SARS-CoV-2 backbone differed substantially from those of existing coronaviruses, but significantly resembled related viruses found in bats and pangolins.

The scientific conclusion was that the two characteristics of the virus – i.e. the mutations in the RBD portion of the spike protein, and its unique backbone – rule out any laboratory engineering as the origin of SARS-CoV-2. Scientists from other laboratories have since confirmed these results, and assert that it is crucially important to bring evidence-based scientific data to the conspiracies circulating about the origins of SARS-CoV-2.

The conclusion is that the novel coronavirus is the product of natural evolution and not genetic engineering. Conspiracy theorists will find it easier to blame evil scientists for the pandemic, but harder to accept the fact that aberrant everyday human behaviour may have brought us face-to-face with this pandemic.

Likely origins for natural selection of SARS-CoV-2

After extensive genomic fingerprinting and sequence analyses, the team of scientists has concluded that the origin of SARS-CoV-2 follows one of two possible scenarios:

In the first scenario, the novel coronavirus evolved through natural selection to its current pathogenic state in a non-human host, and only later jumped to humans. Not surprisingly, previous coronavirus outbreaks emerged in the same pattern, with humans contracting the coronavirus after exposure to civet cats (SARS, 2003) and camels (MERS, 2012).

The researchers strongly proposed bats as the most likely reservoir for SARS-CoV-2 because of its genetic similarity to a bat coronavirus (more than 96 percent similarity). An intermediate host was also likely involved between bats and humans, since there is currently no documented proof of direct bat-to-human transmission.
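A similarity figure such as “96 percent” is, at its simplest, the share of matching positions between two aligned genome sequences. The sketch below shows that calculation on two short made-up strings. They are toy data and not real viral sequences, the function name is my own, and real comparisons rely on dedicated alignment software run over the full genomes.

```python
# Percent identity between two sequences that are already aligned.
# The sequences below are invented toy data, not real genome fragments.

def percent_identity(seq_a, seq_b):
    """Percentage of matching positions, ignoring columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


toy_human_virus = "ATGTTTGTTTTTCTTGTTTTATTGC"  # 25 letters, made up
toy_bat_virus = "ATGTTTGTTTTTTTTGTTTTATTGC"    # same, with one substitution

print(f"identity: {percent_identity(toy_human_virus, toy_bat_virus):.1f}%")
```

Here 24 of the 25 positions match, giving 96.0 percent identity. The same counting, applied across whole genomes, is what lies behind similarity figures like the one quoted above.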

In this first scenario, the two distinctive features of SARS-CoV-2 – the RBD portion of the spike protein that binds to cells, and the CS that opens the ‘door’ of the coronavirus – probably evolved to their current pathogenic state before infecting humans. What this means is that the current pandemic probably emerged soon after humans were infected, since the novel coronavirus would by then have acquired the pathogenic features which enable it to spread among people.

In the second scenario, an original non-pathogenic version of the novel coronavirus moved from a non-human to a human host, after which it evolved into its current pathogenic state within the human population. For instance, genomic analyses show that some coronaviruses from pangolins, animals found in Africa and Asia, have an RBD structure very similar to that of SARS-CoV-2. Epidemiologically, it is possible that a coronavirus from a pangolin was transmitted to a human, either directly or through an intermediary host.

In this second scenario, the CS of SARS-CoV-2 could have evolved within a human host, with the virus circulating undetected in the human population before the onset of the pandemic. The scientists found that the CS of SARS-CoV-2 appears genetically similar to the CSs of some bird-flu strains which spread easily between humans.

Consequently, SARS-CoV-2 could have evolved a virulent cleavage site in human cells and thereafter initiated the pandemic. In this scenario, the novel coronavirus would have become more adept at spreading between humans.

The caution, however, lies in the difficulty (if not impossibility) of knowing which of the two scenarios is more likely. It is also scary to note that if this pathogenic coronavirus entered the human population from an animal source, then the possibility of future outbreaks still exists, since the disease-causing strain may still be circulating in the animal population and could once again jump into humans. Empirical research on this novel coronavirus has been challenging, and is still far from homing in on the exact source of evolution in order to understand its transmission dynamics.

But why is the research on this novel coronavirus so challenging?

According to Dr. Jianfeng He, a former scientist at Wuhan Institute of Virology, the chief expert at Guangdong CDC, and Director of the Institute for Infectious Disease Control and Prevention in Guangdong, China: “This can be explained by the relationship between the virus and its hosts. They co-exist. If the host is a person, the premise of a good research is that he/she must survive.

“He/she can be diseased but cannot die. Once he/she dies, so does the virus. Without the stage (the patient) and the actor (the virus), the audience (the scientists) will be hard put to find the clues. SARS-CoV-2 is highly fatal, with a mortality reaching 10 percent. Without proper management, the case-fatality could even be much higher. Therefore, it has been quite difficult to carry out relevant in-vivo research.”


Despite the rumours, speculations and conspiracy theories, these genetic, phylodynamic, and epidemiological analyses by scientists offer a convincing perspective on the notable characteristics of the SARS-CoV-2 genome, as well as possible scenarios by which the virus could have arisen. The scientific conclusion from the data is that SARS-CoV-2 is not a genetically engineered or a wilfully manipulated virus.

>>>The writer is a Professor in Virology, Molecular Medicine and Nanotechnology at Regent University College of Science and Technology, Accra


