Ghosts in the Machine: The Mysterious World of Palaeovirology

Guest Author – James Ormiston
Palaeontology MSci Graduate

How do you study an aspect of ancient life that doesn’t leave behind fossils? Is it even palaeontology if the thing you’re studying wasn’t technically alive in the first place? After all, that’s what the “onto” part means! These are the problems tackled by the strange sub-field of palaeovirology.

The “fossils” left by ancient viruses are not found in rocks, but as phantom whispers in the modern genomes of their hosts. So although it has “palaeo” in the name, and it involves studying “fossils”, palaeovirology and palaeontology are generally distinct fields. But they do share some key features. Real geological fossils do play a role in palaeovirological investigation, along with other shared techniques like gene sequencing and molecular clocks. And it turns out palaeovirology has a lot to say about our evolution, as a sizeable chunk of the 3 billion DNA base pairs in your genome was put there by viruses!

A Temple Under Siege

These days we are relatively familiar with the biological concept of our bodies being not entirely our own. We are walking ecosystems; a cloud of diverse micro-organisms interspersed with a unified scaffolding of cells that constitute the human individual. It is also an ecosystem constantly under siege from nature’s very own killer robots, the viruses, and even our basic genetic blueprints are not free from contamination. This is easily demonstrated by the current wave of viral disease afflicting the world: COVID-19, caused by the coronavirus SARS-CoV-2.

Schoolroom biology tells us that many viruses survive by implanting their genetic material into host cells, corrupting cellular machinery to make copies of themselves. By simple probability this usually happens to somatic cells, those that make up the majority of the body, but on occasion it can also happen to germline cells. Germline cells include sperm and eggs and are responsible for passing on genetic material to offspring. They form a vanishingly small percentage of the body’s overall cell count. However, that does not make them immune, and sometimes the imprint of viral infection becomes permanent…and inheritable. These imprints are called “endogenous viral elements”; EVEs for short.

Most EVEs in the modern human genome come from retroviruses (these are different to coronaviruses). Retroviruses are so named because they reverse the usual flow of genetic information from DNA to RNA (hence “retro”). In a nutshell, a retrovirus copies its genetic material, RNA, into DNA, which it then inserts into the host cell’s genome. The host cell now has viral DNA mixed with its own, and proceeds to read it off and create the proteins needed to make new viruses.
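For readers who like to see the logic spelled out, the retroviral life cycle described above can be caricatured in a few lines of Python. This is only a toy sketch: the sequences, the insertion position and the function names are all made up for illustration, and real reverse transcription and integration are far messier.

```python
# Toy model of retroviral integration: RNA -> DNA -> inserted into host DNA.
# All sequences and positions are invented for illustration only.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(viral_rna: str) -> str:
    """Return DNA carrying the same message as the viral RNA (U becomes T)."""
    # First strand: DNA complementary to the RNA template, read in reverse.
    first_strand = "".join(RNA_TO_DNA[base] for base in reversed(viral_rna))
    # Second strand: reverse complement of the first, restoring the RNA's sense.
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand))


def integrate(host_dna: str, proviral_dna: str, position: int) -> str:
    """Splice the viral DNA into the host sequence at the given position."""
    return host_dna[:position] + proviral_dna + host_dna[position:]


viral_rna = "AUGGCU"      # hypothetical viral genome
host_dna = "CCCCGGGG"     # hypothetical stretch of host genome

provirus = reverse_transcribe(viral_rna)
infected = integrate(host_dna, provirus, position=4)
print(provirus)   # ATGGCT
print(infected)   # CCCCATGGCTGGGG
```

If the "infected" sequence belongs to a germline cell, every descendant that inherits it carries the viral insert too.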

When this happens in germline cells that get passed to the next generation, there is a chance that some of the offspring’s genome will preserve an artefact of the parent’s infection, in the form of “junk” DNA that the virus left behind after using it for replication. These remnants are referred to as “endogenous retroviruses”, or ERVs, and account for 5-8% of the total human genome. This is where palaeovirology comes into the picture. We can use EVEs like ERVs to study ancient and extinct viruses that infected our evolutionary ancestors, and hopefully use that knowledge to combat future viral diseases. The original owners of EVEs are therefore called “palaeoviruses”.
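To put that percentage in perspective, here is a quick back-of-the-envelope calculation using the round figure of 3 billion base pairs quoted earlier in this article. The function name is just a label chosen for this sketch.

```python
GENOME_BASE_PAIRS = 3_000_000_000  # approximate size quoted in this article


def erv_base_pairs(percent: int) -> int:
    """Number of base pairs covered by ERVs at a given whole-number percentage."""
    return GENOME_BASE_PAIRS * percent // 100


low, high = erv_base_pairs(5), erv_base_pairs(8)
print(f"{low:,} to {high:,} base pairs")  # 150,000,000 to 240,000,000 base pairs
```

In other words, somewhere between 150 and 240 million letters of your DNA are retroviral leftovers.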

Scanning electron microscope image of the retrovirus HIV-1 (green), responsible for AIDS. From Centers for Disease Control and Prevention.

The human genome is very large, large enough for a virus to have plenty of choice in where to insert its DNA. ERVs tend to occur in parts of the genome where there are no functional genes (non-coding regions), but even so, it is thought that a vast majority of ancient viral infections have left no trace at all. This is because evolution can purge the integrated DNA, and the process of genetic drift simply overwrites the non-coding regions of the genome over time. We can tell that this happens because Human T-Cell Leukemia retroviruses are known to have infected us around 20,000 years ago, but there is no genomic trace of them. Those ERVs that do survive become fixed in populations due to large-scale, repeated infections. We can track these infection events through deep time using phylogenetics, gene sequencing, mutation rate estimates and real fossils for time-calibration (the “molecular clock”). What is revealed is a parallel fossil record in our own DNA. By extension, we can apply this approach to the DNA of all living vertebrates to hunt down infection events caused by extinct viruses.
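The “molecular clock” idea can be sketched in code. The example below is a deliberately simplified illustration, not the method used in any particular study: it assumes two already-aligned copies of the same ERV from two species, counts the fraction of sites that differ, and divides by twice an assumed mutation rate (twice, because both lineages have been accumulating changes since they split). The sequences and the rate of one change per billion sites per year are hypothetical, and no correction is made for repeated mutations at the same site.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def divergence_time_years(distance: float, rate_per_site_per_year: float) -> float:
    """Years since two lineages split, assuming a constant mutation rate."""
    # Both lineages mutate independently, hence the factor of two.
    return distance / (2 * rate_per_site_per_year)


# Hypothetical aligned ERV fragments from two species (2 differences in 20 sites).
erv_species_a = "ACGTACGTACGTACGTACGT"
erv_species_b = "ACGAACGTACGTCCGTACGT"
ASSUMED_RATE = 1e-9  # hypothetical: changes per site per year

distance = p_distance(erv_species_a, erv_species_b)
age = divergence_time_years(distance, ASSUMED_RATE)
print(f"divergence: {distance:.2f}, estimated split: {age / 1e6:.0f} million years ago")
```

In practice the assumed rate is the weak point, which is why real fossils of known age are used to calibrate it.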

How virus DNA can be passed down to the host’s offspring after infection. Syncytin will be discussed later in this article. From Lavialle et al. (2013).

In the Court of the Crimson Queen

But there are other ways of detecting the presence of these prehistoric invaders, even in the absence of their DNA fragments. A sort of “secondary signal” in genomic evolution is present in the form of changes to antiviral defence genes. In any given viral infection event, existing antiviral genes are suddenly exposed to strong selection pressure as the population tries to fight the virus off. Some variants of these genes will help to repel the virus. The successful variants then become more established in the population and become fixed into its gene pool. This is because even if the virus doesn’t kill those who don’t have the beneficial variants, not being able to fight a virus effectively is still a big cost to fitness (fitness being a measure of the likelihood of an organism surviving long enough to reproduce), so natural selection will select against them. On the other side of the battlefield, the viruses are also evolving in response to antiviral genes to ensure they can continue to parasitise the host; a classic evolutionary arms race!
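How quickly can a protective variant take over? The sketch below uses a textbook-style, deliberately simple selection model: each generation, carriers of the protective variant reproduce at full fitness while everyone else pays a fixed fitness cost. The starting frequency, the 10% cost and the function names are assumptions chosen purely for illustration, and the model ignores chance, population size and the virus evolving back.

```python
from typing import Optional


def next_frequency(p: float, cost: float) -> float:
    """Frequency of the protective variant after one generation of selection."""
    protected = p * 1.0                  # carriers keep full fitness
    unprotected = (1 - p) * (1 - cost)   # non-carriers pay the fitness cost
    return protected / (protected + unprotected)


def generations_to_reach(start: float, target: float, cost: float,
                         max_generations: int = 100_000) -> Optional[int]:
    """Generations needed for the variant to rise from start to target."""
    p = start
    for generation in range(max_generations):
        if p >= target:
            return generation
        p = next_frequency(p, cost)
    return None  # target never reached (for example when cost is zero)


# Hypothetical scenario: the variant starts in 1% of the population and the
# virus imposes a 10% fitness cost on everyone who lacks it.
generations = generations_to_reach(start=0.01, target=0.99, cost=0.10)
print(f"Variant reaches 99% of the population after {generations} generations")
```

Even a modest cost sweeps the protective variant to near-fixation in an evolutionary blink, which is why these events leave such a clear mark on antiviral genes.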

A brief history of viral infections in primates. The red lightning bolts are recent viruses, the rest are “palaeoviruses” whose presence is inferred from changes to antiviral genes recorded in primate genomes. From Emerman & Malik (2010).

These arms races can therefore be identified by looking for significant changes in antiviral genes across an organism’s evolutionary “family tree”, so that even though there may be no EVEs we can be relatively certain that a palaeovirus of some sort was going through the population. But enough postulating, let’s look at some examples.

TRIM5 is an antiviral gene we share with other primates, but its effectiveness varies depending on which primate it belongs to. This is especially important to us because the further back you go into the primate evolutionary tree, the more effective TRIM5 becomes against some viruses and the less effective it becomes against others. Most notable for us is what happens to HIV resistance. HIV, which is thought to have been transmitted zoonotically (meaning we caught it from animals, in this case other primates), has been plaguing humanity for decades. Reconstructions of TRIM5’s base sequence through its evolution show that the closer to humans it gets, the worse it is at combating HIV infection. This is because it has likely been affected by other palaeoviral infections, which caused it to change and lose its potency. At some point before humans diverged from chimpanzees, TRIM5 was afflicted with a mutation which severely stunted its effectiveness, and so we became vulnerable to HIV again.

Changes in the gene TRIM5 through primate evolution. The darker the bar becomes, the better TRIM5 is at resisting HIV. The dotted lines represent points in primate evolution where TRIM5 was sequenced for comparison. The box on the right highlights base pairs susceptible to mutation in red, in particular at point R332 (before the human/chimp split) which significantly impacted TRIM5’s effectiveness. From Emerman & Malik (2010).

So, TRIM5 demonstrates that palaeoviruses have had a longer-lasting impact on us than just their initial infection, with consequences reaching all the way to our present biology. We are the product of survival. The survival of individuals who had antiviral genes capable of fighting past viruses, antiviral genes that eventually became fixed in our populations. But this also means our immune systems are not often ready to fight new threats. This is, after all, why a new flu vaccine is needed every year. The goal posts are always being shifted by evolution.

Over long periods of time, antiviral genes that are no longer needed can hinder fitness as they serve no immediate function, so they start to deteriorate in the absence of the pathogen they fended off. These changes in response to palaeoviruses could be very useful in a wider sense. By including palaeovirology in palaeogeography, one could track ancient diseases across continents over large time scales. Palaeovirology could even help us develop better treatments in fields like gene therapy.

Biological Upcycling

Remember how earlier in this article I said that the viral DNA integrated into the germline genome is “junk”? Well as it happens this isn’t always the case. Although many ERVs become broken up and cleared out, some remain intact. Of these survivors some can be “resurrected” and become viral again, while others have been co-opted by their hosts to actually become functional genes – independently across different groups of animals. These genes can be kept for millions of years as they provide a survival advantage to the point of being critical for normal biological function.

Syncytin genes, for example, are found across eutherian mammals and are very important for reproduction. Their job lies in the interface between a mother and her foetus during pregnancy, helping to prevent rejection of the foetus by the mother’s immune system. When these genes are knocked out in mice, the placenta fails to develop properly and the embryo dies mid-gestation. Palaeovirological studies have revealed syncytins have their origins as ERVs, and could be key to understanding the evolution of the placenta from egg-laying ancestors. Even more interesting is that different lineages of eutherian mammals seem to have independently acquired different syncytins from different palaeoviruses over tens of millions of years. These newer ERVs replaced those already captured from previous viruses. It makes sense that these once belonged to viruses, since their job is to suppress the immune system, just the kind of thing that arises from host-parasite arms races.

In conclusion, it seems that we all have a little bit of palaeovirus inside us, even to the extent that in some ways we couldn’t live without them, and our DNA is its own fossil record of our run-ins with pathogens of the past. As we are seeing in these tumultuous times, this is a relationship that will likely continue for as long as viruses have hosts to infect. Viruses need us, and although usually they make our lives more difficult, they have been instrumental in our evolution. This year has been a big one for virus research, as over 4,000 genomes of SARS-CoV-2, the virus behind COVID-19, have been sequenced to aid the fight against it, but it has also been potentially important for palaeovirology. According to a pre-print paper (yet to receive peer review), a team in Japan have performed the first experimental demonstration of the phenomenon described here. A viral gene was reportedly inserted into the germline of a mouse, which passed the gene on to its offspring, where it was detected in their tissues. If verified, this would be a landmark moment: although we have discovered many traces of this kind of event, it has never been observed in vivo…perhaps until now.

References and Further Reading:

Emerman & Malik (2010), “Paleovirology – Modern Consequences of Ancient Viruses.” PLoS Biol 8(2)

[Pre-print] Iida et al. (2020) “Heritable endogenization of an RNA virus in a mammalian species.” bioRxiv 2020.01.19.911933

Lavialle et al. (2013), “Paleovirology of ‘syncytins’, retroviral env genes exapted for a role in placentation.” Phil Trans R Soc B 368

Edited by Rhys Charles
