CpG-creating mutations are costly in many human viruses.
Caudill VR., Qin S., Winstead R., Kaur J., Tisthammer K., Pineda EG., Solis C., Cobey S., Bedford T., Carja O., Eggo RM., Koelle K., Lythgoe K., Regoes R., Roy S., Allen N., Aviles M., Baker BA., Bauer W., Bermudez S., Carlson C., Castellanos E., Catalan FL., Chemel AK., Elliot J., Evans D., Fiutek N., Fryer E., Goodfellow SM., Hecht M., Hopp K., Hopson ED., Jaberi A., Kinney C., Lao D., Le A., Lo J., Lopez AG., López A., Lorenzo FG., Luu GT., Mahoney AR., Melton RL., Nascimento GD., Pradhananga A., Rodrigues NS., Shieh A., Sims J., Singh R., Sulaeman H., Thu R., Tran K., Tran L., Winters EJ., Wong A., Pennings PS.
Mutations can occur throughout the virus genome and may be beneficial, neutral or deleterious. We are interested in mutations that create CpG sites, that is, mutations that place a C immediately before a G. CpG sites are rare in both eukaryotic and viral genomes. In eukaryotes, CpG sites are thought to be rare because they are prone to mutation when methylated. In viruses, less is known about why CpG sites are rare. A previous study in HIV suggested that CpG-creating transition mutations are more costly than otherwise similar transition mutations that do not create a CpG site. To determine whether this is also the case in other viruses, we analyzed the allele frequencies of CpG-creating and non-CpG-creating mutations across various strains, subtypes, and genes of viruses, using existing data obtained from GenBank, the HIV Databases, and the Virus Pathogen Resource. Our results suggest that CpG sites are indeed costly for most viruses. Understanding the cost of CpG sites can provide further insight into the evolution and adaptation of viruses.
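
The comparison described above depends on deciding, site by site, whether a transition mutation creates a CpG site. The following Python sketch illustrates one way to make that classification and then group per-site mutation frequencies accordingly. It is not the authors' pipeline: the function names (`count_cpg`, `creates_cpg`, `split_by_cpg`) and the input format (a reference sequence plus a dictionary of per-site frequencies) are assumptions made for illustration only.

```python
# Illustrative sketch only; names and input format are hypothetical,
# not taken from the study's actual analysis code.

# Each base and the base it becomes under a transition mutation.
TRANSITIONS = {"A": "G", "G": "A", "C": "T", "T": "C"}


def count_cpg(seq):
    """Count CpG dinucleotides (a C immediately followed by a G)."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def creates_cpg(seq, pos):
    """Return True if the transition at 0-based position `pos` adds a CpG site."""
    seq = seq.upper()
    mutated = seq[:pos] + TRANSITIONS[seq[pos]] + seq[pos + 1:]
    return count_cpg(mutated) > count_cpg(seq)


def split_by_cpg(seq, freqs):
    """Group transition frequencies by whether the mutation creates a CpG site.

    `freqs` maps a 0-based position to the observed frequency of the
    transition mutation at that position (assumed input format).
    Returns a dict with keys True (CpG-creating) and False (non-CpG-creating).
    """
    groups = {True: [], False: []}
    for pos, freq in freqs.items():
        groups[creates_cpg(seq, pos)].append(freq)
    return groups
```

For example, in the toy sequence `CATGA`, the A-to-G transition at position 1 turns `CA` into `CG` and the T-to-C transition at position 2 turns `TG` into `CG`, so both are classified as CpG-creating, while transitions at the other three positions are not. Under the hypothesis in the text, the CpG-creating group would be expected to show lower frequencies than the non-CpG-creating group.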