An open dataset of Plasmodium falciparum genome variation in 7,000 worldwide samples
Ahouidi A., Ali M., Almagro-Garcia J., Amambua-Ngwa A., Amaratunga C., Amato R., Amenga-Etego L., Andagalu B., Anderson TJC., Andrianaranjaka V., Apinjoh T., Ariani C., Ashley EA., Auburn S., Awandare GA., Ba H., Baraka V., Barry AE., Bejon P., Bertin GI., Boni MF., Borrmann S., Bousema T., Branch O., Bull PC., Busby GBJ., Chookajorn T., Chotivanich K., Claessens A., Conway D., Craig A., D'Alessandro U., Dama S., Day NPJ., Denis B., Diakite M., Djimdé A., Dolecek C., Dondorp AM., Drakeley C., Drury E., Duffy P., Echeverry DF., Egwang TG., Erko B., Fairhurst RM., Faiz A., Fanello CA., Fukuda MM., Gamboa D., Ghansah A., Golassa L., Goncalves S., Hamilton WL., Harrison GLA., Hart L., Henrichs C., Hien TT., Hill CA., Hodgson A., Hubbart C., Imwong M., Ishengoma DS., Jackson SA., Jacob CG., Jeffery B., Jeffreys AE., Johnson KJ., Jyothi D., Kamaliddin C., Kamau E., Kekre M., Kluczynski K., Kochakarn T., Konaté A., Kwiatkowski DP., Kyaw MP., Lim P., Lon C., Loua KM., Maïga-Ascofaré O., Malangone C., Manske M., Marfurt J., Marsh K., Mayxay M., Miles A., Miotto O., Mobegi V., Mokuolu OA., Montgomery J., Mueller I., Newton PN., Nguyen T., Nguyen T-N., Noedl H., Nosten F., Noviyanti R., Nzila A., Ochola-Oyier LI., Ocholla H., Oduro A., Omedo I., Onyamboko MA., Ouedraogo J-B., Oyebola K., Pearson RD., Peshu N., Phyo AP., Plowe CV., Price RN., Pukrittayakamee S., Randrianarivelojosia M., Rayner JC., Ringwald P., Rockett KA., Rowlands K., Ruiz L., Saunders D., Shayo A., Siba P., Simpson VJ., Stalker J., Su X-Z., Sutherland C., Takala-Harrison S., Tavul L., Thathy V., Tshefu A., Verra F., Vinetz J., Wellems TE., Wendler J., White NJ., Wright I., Yavo W., Ye H.
MalariaGEN is a data-sharing network that enables groups around the world to work together on the genomic epidemiology of malaria. Here we describe a new release of curated genome variation data on 7,000 Plasmodium falciparum samples from MalariaGEN partner studies in 28 malaria-endemic countries. High-quality genotype calls on 3 million single nucleotide polymorphisms (SNPs) and short indels were produced using a standardised analysis pipeline. Copy number variants associated with drug resistance and structural variants that cause failure of rapid diagnostic tests were also analysed. Almost all samples showed genetic evidence of resistance to at least one antimalarial drug, and some samples from Southeast Asia carried markers of resistance to six commonly used drugs. Genes expressed during the mosquito stage of the parasite life cycle are prominent among loci that show strong geographic differentiation. By continuing to enlarge this open data resource, we aim to facilitate research into the evolutionary processes affecting malaria control and to accelerate development of the surveillance toolkit required for malaria elimination.