An open dataset of Plasmodium falciparum genome variation in 7,000 worldwide samples
MalariaGEN, Ahouidi A., Ali M., Almagro-Garcia J., Amambua-Ngwa A., Amaratunga C., Amato R., Amenga-Etego L., Andagalu B., Anderson T., Andrianaranjaka V., Apinjoh T., Ariani C., Ashley E., Auburn S., Awandare G., Ba H., Baraka V., Barry A., Bejon P., Bertin G., Boni M., Borrmann S., Bousema T., Branch O., Bull P., Busby G., Chookajorn T., Chotivanich K., Claessens A., Conway D., Craig A., D'Alessandro U., Dama S., Day NPJ., Denis B., Diakite M., Djimdé A., Dolecek C., Dondorp A., Drakeley C., Drury E., Duffy P., Echeverry D., Egwang T., Erko B., Fairhurst R., Faiz A., Fanello C., Fukuda M., Gamboa D., Ghansah A., Golassa L., Goncalves S., Hamilton W., Harrison A., Hart L., Henrichs C., Hien TT., Hill C., Hodgson A., Hubbart C., Imwong M., Ishengoma D., Jackson S., Jacob C., Jeffery B., Jeffreys A., Johnson K., Jyothi D., Kamaliddin C., Kamau E., Kekre M., Kluczynski K., Kochakarn T., Konaté A., Kwiatkowski D., Kyaw MP., Lim P., Lon C., Loua K., Maïga-Ascofaré O., Malangone C., Manske M., Marfurt J., Marsh K., Mayxay M., Miles A., Miotto O., Mobegi V., Mokuolu O., Montgomery J., Mueller I., Newton P., Nguyen T., Nguyen T-N., Noedl H., Nosten F., Noviyanti R., Nzila A., Ochola-Oyier L., Ocholla H., Oduro A., Omedo I., Onyamboko M., Ouedraogo J-B., Oyebola K., Pearson R., Peshu N., Phyo AP., Plowe C., Price R., Pukrittayakamee S., Randrianarivelojosia M., Rayner J., Ringwald P., Rockett K., Rowlands K., Ruiz L., Saunders D., Shayo A., Siba P., Simpson V., Stalker J., Su X-Z., Sutherland C., Takala-Harrison S., Tavul L., Thathy V., Tshefu A., Verra F., Vinetz J., Wellems T., Wendler J., White N., Wright I., Yavo W., Ye H.
MalariaGEN is a data-sharing network that enables groups around the world to work together on the genomic epidemiology of malaria. Here we describe a new release of curated genome variation data on 7,000 Plasmodium falciparum samples from MalariaGEN partner studies in 28 malaria-endemic countries. High-quality genotype calls on 3 million single nucleotide polymorphisms (SNPs) and short indels were produced using a standardised analysis pipeline. Copy number variants associated with drug resistance and structural variants that cause failure of rapid diagnostic tests were also analysed. Almost all samples showed genetic evidence of resistance to at least one antimalarial drug, and some samples from Southeast Asia carried markers of resistance to six commonly used drugs. Genes expressed during the mosquito stage of the parasite life cycle are prominent among loci that show strong geographic differentiation. By continuing to enlarge this open data resource, we aim to facilitate research into the evolutionary processes affecting malaria control and to accelerate development of the surveillance toolkit required for malaria elimination.