
David Aanensen

For endemic pathogens (and outbreak scenarios), epidemiological data combined with genomics can inform control strategies and interventions on a local, national and international scale. Generating, integrating, analysing and interpreting data in real time is challenging, but crucial for decision making and action.

Within the Centre for Genomic Pathogen Surveillance, David and his team focus on data flow and the use of genome sequencing for surveillance of microbial pathogens. Their work combines web application and software engineering, methods development, and large-scale structured pathogen surveys and sequencing of microbes, with delivery of information for decision making.

The team works with major public health agencies such as the US CDC, the European CDC, Public Health England and the WHO, where these systems are used to interpret data and aid decision making for infection control.

David is also Director of the NIHR-funded Global Health Research Unit on Genomic Surveillance of Antimicrobial Resistance, working with partners leading national AMR strategies in the Philippines, Colombia, Nigeria and India to implement genomic surveillance and link it to routine phenotypic and epidemiological data for priority pathogens.

Major Applications include:

Epicollect5 - Mobile data-gathering platform used globally by major health agencies, citizen scientists, ecologists, epidemiologists, business analytics, schools and colleges (largely initiatives outside of the initial use cases), with over 14,000 projects and more than 28 million data points.

Microreact - Open data visualisation and sharing for genomic epidemiology. Used by major agencies such as CDC, ECDC and PHE for routine investigation of public health incidents.

Pathogenwatch - A global platform for genomic surveillance of microbial pathogens (including all major WHO priority bacterial pathogens). It provides rapid prediction of resistance genotypes, and clustering that gives epidemiological context.


David S, Reuter S, Harris SR, Glasner C, Feltwell T, Argimon S, Abudahab K, Goater R, Giani T, Errico G et al. 2019. Epidemic of carbapenem-resistant Klebsiella pneumoniae in Europe is driven by nosocomial spread. Nat Microbiol, 4 (11), pp. 1919-1929.

Public health interventions to control the current epidemic of carbapenem-resistant Klebsiella pneumoniae rely on a comprehensive understanding of its emergence and spread over a wide range of geographical scales. We analysed the genome sequences and epidemiological data of >1,700 K. pneumoniae samples isolated from patients in 244 hospitals in 32 countries during the European Survey of Carbapenemase-Producing Enterobacteriaceae. We demonstrate that carbapenemase acquisition is the main cause of carbapenem resistance and that it occurred across diverse phylogenetic backgrounds. However, 477 of 682 (69.9%) carbapenemase-positive isolates are concentrated in four clonal lineages, sequence types 11, 15, 101, 258/512 and their derivatives. Combined analysis of the genetic and geographic distances between isolates with different β-lactam resistance determinants suggests that the propensity of K. pneumoniae to spread in hospital environments correlates with the degree of resistance and that carbapenemase-positive isolates have the highest transmissibility. Indeed, we found that over half of the hospitals that contributed carbapenemase-positive isolates probably experienced within-hospital transmission, and interhospital spread is far more frequent within, rather than between, countries. Finally, we propose a value of 21 for the number of single nucleotide polymorphisms that optimizes the discrimination of hospital clusters and detail the international spread of the successful epidemic lineage, ST258/512.

Lo SW, Gladstone RA, van Tonder AJ, Lees JA, du Plessis M, Benisty R, Givon-Lavi N, Hawkins PA, Cornick JE, Kwambana-Adams B et al. 2019. Pneumococcal lineages associated with serotype replacement and antibiotic resistance in childhood invasive pneumococcal disease in the post-PCV13 era: an international whole-genome sequencing study. Lancet Infect Dis, 19 (7), pp. 759-769.

BACKGROUND: Invasive pneumococcal disease remains an important health priority owing to increasing disease incidence caused by pneumococci expressing non-vaccine serotypes. We previously defined 621 Global Pneumococcal Sequence Clusters (GPSCs) by analysing 20 027 pneumococcal isolates collected worldwide and from previously published genomic data. In this study, we aimed to investigate the pneumococcal lineages behind the predominant serotypes, the mechanism of serotype replacement in disease, as well as the major pneumococcal lineages contributing to invasive pneumococcal disease in the post-vaccine era and their antibiotic resistant traits. METHODS: We whole-genome sequenced 3233 invasive pneumococcal disease isolates from laboratory-based surveillance programmes in Hong Kong (n=78), Israel (n=701), Malawi (n=226), South Africa (n=1351), The Gambia (n=203), and the USA (n=674). The genomes represented pneumococci from before and after pneumococcal conjugate vaccine (PCV) introductions and were from children younger than 3 years. We identified predominant serotypes by prevalence and their major contributing lineages in each country, and assessed any serotype replacement by comparing the incidence rate between the pre-PCV and PCV periods for Israel, South Africa, and the USA. We defined the status of a lineage as vaccine-type GPSC (≥50% 13-valent PCV [PCV13] serotypes) or non-vaccine-type GPSC (>50% non-PCV13 serotypes) on the basis of its initial serotype composition detected in the earliest vaccine period to measure their individual contribution toward serotype replacement in each country. Major pneumococcal lineages in the PCV period were identified by pooled incidence rate using a random effects model. FINDINGS: The five most prevalent serotypes in the PCV13 period varied between countries, with only serotypes 5, 12F, 15B/C, 19A, 33F, and 35B/D common to two or more countries. 
These serotypes were associated with more than one lineage, except for serotype 5 (GPSC8). Serotype replacement was mainly mediated by expansion of non-vaccine serotypes within vaccine-type GPSCs and, to a lesser extent, by increases in non-vaccine-type GPSCs. A globally spreading lineage, GPSC3, expressing invasive serotypes 8 in South Africa and 33F in the USA and Israel, was the most common lineage causing non-vaccine serotype invasive pneumococcal disease in the PCV13 period. We observed that same prevalent non-vaccine serotypes could be associated with distinctive lineages in different countries, which exhibited dissimilar antibiotic resistance profiles. In non-vaccine serotype isolates, we detected significant increases in the prevalence of resistance to penicillin (52 [21%] of 249 vs 169 [29%] of 575, p=0·0016) and erythromycin (three [1%] of 249 vs 65 [11%] of 575, p=0·0031) in the PCV13 period compared with the pre-PCV period. INTERPRETATION: Globally spreading lineages expressing invasive serotypes have an important role in serotype replacement, and emerging non-vaccine serotypes associated with different pneumococcal lineages in different countries might be explained by local antibiotic-selective pressures. Continued genomic surveillance of the dynamics of the pneumococcal population with increased geographical representation in the post-vaccine period will generate further knowledge for optimising future vaccine design. FUNDING: Bill & Melinda Gates Foundation, Wellcome Sanger Institute, and the US Centers for Disease Control.

McNally A, Kallonen T, Connor C, Abudahab K, Aanensen DM, Horner C, Peacock SJ, Parkhill J, Croucher NJ, Corander J. 2019. Diversification of Colonization Factors in a Multidrug-Resistant Escherichia coli Lineage Evolving under Negative Frequency-Dependent Selection. MBio, 10 (2).

Escherichia coli is a major cause of bloodstream and urinary tract infections globally. The wide dissemination of multidrug-resistant (MDR) strains of extraintestinal pathogenic E. coli (ExPEC) poses a rapidly increasing public health burden due to narrowed treatment options and increased risk of failure to clear an infection. Here, we present a detailed population genomic analysis of the ExPEC ST131 clone, in which we seek explanations for its success as an emerging pathogenic strain beyond the acquisition of antimicrobial resistance (AMR) genes. We show evidence for evolution toward separate ecological niches for the main clades of ST131 and differential evolution of anaerobic metabolism, key colonization, and virulence factors. We further demonstrate that negative frequency-dependent selection acting across accessory loci is a major mechanism that has shaped the population evolution of this pathogen. IMPORTANCE: Infections with multidrug-resistant (MDR) strains of Escherichia coli are a significant global public health concern. To combat these pathogens, we need a deeper understanding of how they evolved from their background populations. By understanding the processes that underpin their emergence, we can design new strategies to limit evolution of new clones and combat existing clones. By combining population genomics with modelling approaches, we show that dominant MDR clones of E. coli are under the influence of negative frequency-dependent selection, preventing them from rising to fixation in a population. Furthermore, we show that this selection acts on genes involved in anaerobic metabolism, suggesting that this key trait, and the ability to colonize human intestinal tracts, is a key step in the evolution of MDR clones of E. coli.

Fisher MC, Ghosh P, Shelton JMG, Bates K, Brookes L, Wierzbicki C, Rosa GM, Farrer RA, Aanensen DM, Alvarado-Rybak M et al. 2018. Development and worldwide use of non-lethal, and minimal population-level impact, protocols for the isolation of amphibian chytrid fungi. Sci Rep, 8 (1), pp. 7772.

Parasitic chytrid fungi have emerged as a significant threat to amphibian species worldwide, necessitating the development of techniques to isolate these pathogens into culture for research purposes. However, early methods of isolating chytrids from their hosts relied on killing amphibians. We modified a pre-existing protocol for isolating chytrids from infected animals to use toe clips and biopsies from toe webbing rather than euthanizing hosts, and distributed the protocol to researchers as part of the BiodivERsA project RACE; here called the RML protocol. In tandem, we developed a lethal procedure for isolating chytrids from tadpole mouthparts. Reviewing a database of use a decade after their inception, we find that these methods have been applied across 5 continents, 23 countries and in 62 amphibian species. Isolation of chytrids by the non-lethal RML protocol occurred in 18% of attempts with 207 fungal isolates and three species of chytrid being recovered. Isolation of chytrids from tadpoles occurred in 43% of attempts with 334 fungal isolates of one species (Batrachochytrium dendrobatidis) being recovered. Together, these methods have resulted in a significant reduction and refinement of our use of threatened amphibian species and have improved our ability to work with this group of emerging pathogens.

Rhodes J, Abdolrasouli A, Farrer RA, Cuomo CA, Aanensen DM, Armstrong-James D, Fisher MC, Schelenz S. 2018. Genomic epidemiology of the UK outbreak of the emerging human fungal pathogen Candida auris. Emerg Microbes Infect, 7 (1), pp. 43.

Candida auris was first described in 2009, and it has since caused nosocomial outbreaks, invasive infections, and fungaemia across at least 19 countries on five continents. An outbreak of C. auris occurred in a specialized cardiothoracic London hospital between April 2015 and November 2016, which to date has been the largest outbreak in the UK, involving a total of 72 patients. To understand the genetic epidemiology of C. auris infection both within this hospital and within a global context, we sequenced the outbreak isolate genomes using Oxford Nanopore Technologies and Illumina platforms to detect antifungal resistance alleles and reannotate the C. auris genome. Phylogenomic analysis placed the UK outbreak in the India/Pakistan clade, demonstrating an Asian origin; the outbreak showed similar genetic diversity to that of the entire clade, and limited local spatiotemporal clustering was observed. One isolate displayed resistance to both echinocandins and 5-flucytosine; the former was associated with a serine to tyrosine amino acid substitution in the gene FKS1, and the latter was associated with a phenylalanine to isoleucine substitution in the gene FUR1. These mutations add to a growing body of research on multiple antifungal drug targets in this organism. Multiple differential episodic selection of antifungal resistant genotypes has occurred within a genetically heterogenous population across this outbreak, creating a resilient pathogen and making it difficult to define local-scale patterns of transmission and implement outbreak control measures.

Park SE, Pham DT, Boinett C, Wong VK, Pak GD, Panzner U, Espinoza LMC, von Kalckreuth V, Im J, Schütt-Gerowitt H et al. 2018. The phylogeography and incidence of multi-drug resistant typhoid fever in sub-Saharan Africa. Nat Commun, 9 (1), pp. 5094.

There is paucity of data regarding the geographical distribution, incidence, and phylogenetics of multi-drug resistant (MDR) Salmonella Typhi in sub-Saharan Africa. Here we present a phylogenetic reconstruction of whole genome sequenced 249 contemporaneous S. Typhi isolated between 2008-2015 in 11 sub-Saharan African countries, in context of the 2,057 global S. Typhi genomic framework. Despite the broad genetic diversity, the majority of organisms (225/249; 90%) belong to only three genotypes, 4.3.1 (H58) (99/249; 40%), 3.1.1 (97/249; 39%), and 2.3.2 (29/249; 12%). Genotypes 4.3.1 and 3.1.1 are confined within East and West Africa, respectively. MDR phenotype is found in over 50% of organisms restricted within these dominant genotypes. High incidences of MDR S. Typhi are calculated in locations with a high burden of typhoid, specifically in children aged <15 years. Antimicrobial stewardship, MDR surveillance, and the introduction of typhoid conjugate vaccines will be critical for the control of MDR typhoid in Africa.

Abudahab K, Prada JM, Yang Z, Bentley SD, Croucher NJ, Corander J, Aanensen DM. 2019. PANINI: Pangenome Neighbour Identification for Bacterial Populations. Microb Genom, 5 (4).

The standard workhorse for genomic analysis of the evolution of bacterial populations is phylogenetic modelling of mutations in the core genome. However, a notable amount of information about evolutionary and transmission processes in diverse populations can be lost unless the accessory genome is also taken into consideration. Here, we introduce panini (Pangenome Neighbour Identification for Bacterial Populations), a computationally scalable method for identifying the neighbours for each isolate in a data set using unsupervised machine learning with stochastic neighbour embedding based on the t-SNE (t-distributed stochastic neighbour embedding) algorithm. panini is browser-based and integrates with the Microreact platform for rapid online visualization and exploration of both core and accessory genome evolutionary signals, together with relevant epidemiological, geographical, temporal and other metadata. Several case studies with single- and multi-clone pneumococcal populations are presented to demonstrate the ability to identify biologically important signals from gene content data. panini is available at http://panini.pathogen.watch and code at http://gitlab.com/cgps/panini.

Richardson EJ, Bacigalupe R, Harrison EM, Weinert LA, Lycett S, Vrieling M, Robb K, Hoskisson PA, Holden MTG, Feil EJ et al. 2018. Gene exchange drives the ecological success of a multi-host bacterial pathogen. Nat Ecol Evol, 2 (9), pp. 1468-1478.

The capacity for some pathogens to jump into different host-species populations is a major threat to public health and food security. Staphylococcus aureus is a multi-host bacterial pathogen responsible for important human and livestock diseases. Here, using a population-genomic approach, we identify humans as a major hub for ancient and recent S. aureus host-switching events linked to the emergence of endemic livestock strains, and cows as the main animal reservoir for the emergence of human epidemic clones. Such host-species transitions are associated with horizontal acquisition of genetic elements from host-specific gene pools conferring traits required for survival in the new host-niche. Importantly, genes associated with antimicrobial resistance are unevenly distributed among human and animal hosts, reflecting distinct antibiotic usage practices in medicine and agriculture. In addition to gene acquisition, genetic diversification has occurred in pathways associated with nutrient acquisition, implying metabolic remodelling after a host switch in response to distinct nutrient availability. For example, S. aureus from dairy cattle exhibit enhanced utilization of lactose, a major source of carbohydrate in bovine milk. Overall, our findings highlight the influence of human activities on the multi-host ecology of a major bacterial pathogen, underpinned by horizontal gene transfer and core genome diversification.

Harris SR, Cole MJ, Spiteri G, Sánchez-Busó L, Golparian D, Jacobsson S, Goater R, Abudahab K, Yeats CA, Bercot B et al. 2018. Public health surveillance of multidrug-resistant clones of Neisseria gonorrhoeae in Europe: a genomic survey. Lancet Infect Dis, 18 (7), pp. 758-768.

BACKGROUND: Traditional methods for molecular epidemiology of Neisseria gonorrhoeae are suboptimal. Whole-genome sequencing (WGS) offers ideal resolution to describe population dynamics and to predict and infer transmission of antimicrobial resistance, and can enhance infection control through linkage with epidemiological data. We used WGS, in conjunction with linked epidemiological and phenotypic data, to describe the gonococcal population in 20 European countries. We aimed to detail changes in phenotypic antimicrobial resistance levels (and the reasons for these changes) and strain distribution (with a focus on antimicrobial resistance strains in risk groups), and to predict antimicrobial resistance from WGS data. METHODS: We carried out an observational study, in which we sequenced isolates taken from patients with gonorrhoea from the European Gonococcal Antimicrobial Surveillance Programme in 20 countries from September to November, 2013. We also developed a web platform that we used for automated antimicrobial resistance prediction, molecular typing (N gonorrhoeae multi-antigen sequence typing [NG-MAST] and multilocus sequence typing), and phylogenetic clustering in conjunction with epidemiological and phenotypic data. FINDINGS: The multidrug-resistant NG-MAST genogroup G1407 was predominant and accounted for the most cephalosporin resistance, but the prevalence of this genogroup decreased from 248 (23%) of 1066 isolates in a previous study from 2009-10 to 174 (17%) of 1054 isolates in this survey in 2013. This genogroup previously showed an association with men who have sex with men, but changed to an association with heterosexual people (odds ratio=4·29). WGS provided substantially improved resolution and accuracy over NG-MAST and multilocus sequence typing, predicted antimicrobial resistance relatively well, and identified discrepant isolates, mixed infections or contaminants, and multidrug-resistant clades linked to risk groups. 
INTERPRETATION: To our knowledge, we provide the first use of joint analysis of WGS and epidemiological data in an international programme for regional surveillance of sexually transmitted infections. WGS provided enhanced understanding of the distribution of antimicrobial resistance clones, including replacement with clones that were more susceptible to antimicrobials, in several risk groups nationally and regionally. We provide a framework for genomic surveillance of gonococci through standardised sampling, use of WGS, and a shared information architecture for interpretation and dissemination by use of open access software. FUNDING: The European Centre for Disease Prevention and Control, The Centre for Genomic Pathogen Surveillance, Örebro University Hospital, and Wellcome.

O'Hanlon SJ, Rieux A, Farrer RA, Rosa GM, Waldman B, Bataille A, Kosch TA, Murray KA, Brankovics B, Fumagalli M et al. 2018. Recent Asian origin of chytrid fungi causing global amphibian declines. Science, 360 (6389), pp. 621-627.

Globalized infectious diseases are causing species declines worldwide, but their source often remains elusive. We used whole-genome sequencing to solve the spatiotemporal origins of the most devastating panzootic to date, caused by the fungus Batrachochytrium dendrobatidis, a proximate driver of global amphibian declines. We traced the source of B. dendrobatidis to the Korean peninsula, where one lineage, BdASIA-1, exhibits the genetic hallmarks of an ancestral population that seeded the panzootic. We date the emergence of this pathogen to the early 20th century, coinciding with the global expansion of commercial trade in amphibians, and we show that intercontinental transmission is ongoing. Our findings point to East Asia as a geographic hotspot for B. dendrobatidis biodiversity and the original source of these lineages that now parasitize amphibians worldwide.

Hadfield J, Croucher NJ, Goater RJ, Abudahab K, Aanensen DM, Harris SR. 2017. Phandango: an interactive viewer for bacterial population genomics. Bioinformatics, 34 (2), pp. 292-293.

Summary: Fully exploiting the wealth of data in current bacterial population genomics datasets requires synthesising and integrating different types of analysis across millions of base pairs in hundreds or thousands of isolates. Current approaches often use static representations of phylogenetic, epidemiological, statistical and evolutionary analysis results that are difficult to relate to one another. Phandango is an interactive application running in a web browser allowing fast exploration of large-scale population genomics datasets combining the output from multiple genomic analysis methods in an intuitive and interactive manner. Availability: Phandango is a web application freely available for use at www.phandango.net and includes a diverse collection of datasets as examples. Source code together with a detailed wiki page is available on GitHub at https://github.com/jameshadfield/phandango. Contact: jh22@sanger.ac.uk, sh16@sanger.ac.uk.

Otter JA, Doumith M, Davies F, Mookerjee S, Dyakova E, Gilchrist M, Brannigan ET, Bamford K, Galletly T, Donaldson H et al. 2017. Emergence and clonal spread of colistin resistance due to multiple mutational mechanisms in carbapenemase-producing Klebsiella pneumoniae in London. Sci Rep, 7 (1), pp. 12711.

Carbapenemase-producing Enterobacteriaceae (CPE) are emerging worldwide, limiting therapeutic options. Mutational and plasmid-mediated mechanisms of colistin resistance have both been reported. The emergence and clonal spread of colistin resistance was analysed in 40 epidemiologically-related NDM-1 carbapenemase producing Klebsiella pneumoniae isolates identified during an outbreak in a group of London hospitals. Isolates from July 2014 to October 2015 were tested for colistin susceptibility using agar dilution, and characterised by whole genome sequencing (WGS). Colistin resistance was detected in 25/38 (65.8%) cases for which colistin susceptibility was tested. WGS found that three potential mechanisms of colistin resistance had emerged separately, two due to different mutations in mgrB, and one due to a mutation in phoQ, with onward transmission of two distinct colistin-resistant variants, resulting in two sub-clones associated with transmission at separate hospitals. A high rate of colistin resistance (66%) emerged over a 10 month period. WGS demonstrated that mutational colistin resistance emerged three times during the outbreak, with transmission of two colistin-resistant variants.

Domman D, Quilici M-L, Dorman MJ, Njamkepo E, Mutreja A, Mather AE, Delgado G, Morales-Espinosa R, Grimont PAD, Lizárraga-Partida ML et al. 2017. Integrated view of Vibrio cholerae in the Americas. Science, 358 (6364), pp. 789-793.

Latin America has experienced two of the largest cholera epidemics in modern history; one in 1991 and the other in 2010. However, confusion still surrounds the relationships between globally circulating pandemic Vibrio cholerae clones and local bacterial populations. We used whole-genome sequencing to characterize cholera across the Americas over a 40-year time span. We found that both epidemics were the result of intercontinental introductions of seventh pandemic El Tor V. cholerae and that at least seven lineages local to the Americas are associated with disease that differs epidemiologically from epidemic cholera. Our results consolidate historical accounts of pandemic cholera with data to show the importance of local lineages, presenting an integrated view of cholera that is important to the design of future disease control strategies.

Mostowy RJ, Croucher NJ, De Maio N, Chewapreecha C, Salter SJ, Turner P, Aanensen DM, Bentley SD, Didelot X, Fraser C. 2017. Pneumococcal Capsule Synthesis Locus cps as Evolutionary Hotspot with Potential to Generate Novel Serotypes by Recombination. Mol Biol Evol, 34 (10), pp. 2537-2554.

Diversity of the polysaccharide capsule in Streptococcus pneumoniae (the main surface antigen and the target of the currently used pneumococcal vaccines) constitutes a major obstacle in eliminating pneumococcal disease. Such diversity is genetically encoded by almost 100 variants of the capsule biosynthesis locus, cps. However, the evolutionary dynamics of the capsule remains not fully understood. Here, using genetic data from 4,519 bacterial isolates, we found cps to be an evolutionary hotspot with elevated substitution and recombination rates. These rates were a consequence of relaxed purifying selection and positive, diversifying selection acting at this locus, supporting the hypothesis that the capsule has an increased potential to generate novel diversity compared with the rest of the genome. Diversifying selection was particularly evident in the region of wzd/wze genes, which are known to regulate capsule expression and hence the bacterium's ability to cause disease. Using a novel, capsule-centered approach, we analyzed the evolutionary history of 12 major serogroups. Such analysis revealed their complex diversification scenarios, which were principally driven by recombination with other serogroups and other streptococci. Patterns of recombinational exchanges between serogroups could not be explained by serotype frequency alone, thus pointing to nonrandom associations between co-colonizing serotypes. Finally, we discovered a previously unobserved mosaic serotype 39X, which was confirmed to carry a viable and structurally novel capsule. Adding to previous discoveries of other mosaic capsules in densely sampled collections, these results emphasize the strong adaptive potential of the bacterium by its ability to generate novel antigenic diversity by recombination.

Reuter S, Török ME, Holden MTG, Reynolds R, Raven KE, Blane B, Donker T, Bentley SD, Aanensen DM, Grundmann H et al. 2017. Corrigendum: Building a genomic framework for prospective MRSA surveillance in the United Kingdom and the Republic of Ireland. Genome Res, 27 (9), pp. 1622.

Donker T, Reuter S, Scriberras J, Reynolds R, Brown NM, Török ME, James R, Network EOEMR, Aanensen DM, Bentley SD et al. 2017. Population genetic structuring of methicillin-resistant Staphylococcus aureus clone EMRSA-15 within UK reflects patient referral patterns. Microb Genom, 3 (7), pp. e000113.

Antibiotic resistance forms a serious threat to the health of hospitalised patients, rendering otherwise treatable bacterial infections potentially life-threatening. A thorough understanding of the mechanisms by which resistance spreads between patients in different hospitals is required in order to design effective control strategies. We measured the differences between bacterial populations of 52 hospitals in the United Kingdom and Ireland, using whole-genome sequences from 1085 MRSA clonal complex 22 isolates collected between 1998 and 2012. The genetic differences between bacterial populations were compared with the number of patients transferred between hospitals and their regional structure. The MRSA populations within single hospitals, regions and countries were genetically distinct from the rest of the bacterial population at each of these levels. Hospitals from the same patient referral regions showed more similar MRSA populations, as did hospitals sharing many patients. Furthermore, the bacterial populations from different time-periods within the same hospital were generally more similar to each other than contemporaneous bacterial populations from different hospitals. We conclude that, while a large part of the dispersal and expansion of MRSA takes place among patients seeking care in single hospitals, inter-hospital spread of resistant bacteria is by no means a rare occurrence. Hospitals are exposed to constant introductions of MRSA on a number of levels: (1) most MRSA is received from hospitals that directly transfer large numbers of patients, while (2) fewer introductions happen between regions or (3) across national borders, reflecting lower numbers of transferred patients. A joint coordinated control effort between hospitals is therefore paramount for the national control of MRSA, antibiotic-resistant bacteria and other hospital-associated pathogens.

Bayliss SC, Verner-Jeffreys DW, Bartie KL, Aanensen DM, Sheppard SK, Adams A, Feil EJ. 2017. The Promise of Whole Genome Pathogen Sequencing for the Molecular Epidemiology of Emerging Aquaculture Pathogens. Front Microbiol, 8 (FEB), pp. 121.

Aquaculture is the fastest growing food-producing sector, and the sustainability of this industry is critical both for global food security and economic welfare. The management of infectious disease represents a key challenge. Here, we discuss the opportunities afforded by whole genome sequencing of bacterial and viral pathogens of aquaculture to mitigate disease emergence and spread. We outline, by way of comparison, how sequencing technology is transforming the molecular epidemiology of pathogens of public health importance, emphasizing the importance of community-oriented databases and analysis tools.

Grundmann H, Glasner C, Albiger B, Aanensen DM, Tomlinson CT, Andrasević AT, Cantón R, Carmeli Y, Friedrich AW, Giske CG et al. 2017. Occurrence of carbapenemase-producing Klebsiella pneumoniae and Escherichia coli in the European survey of carbapenemase-producing Enterobacteriaceae (EuSCAPE): a prospective, multinational study. Lancet Infect Dis, 17 (2), pp. 153-163.

BACKGROUND: Gaps in the diagnostic capacity and heterogeneity of national surveillance and reporting standards in Europe make it difficult to contain carbapenemase-producing Enterobacteriaceae. We report the development of a consistent sampling framework and the results of the first structured survey on the occurrence of carbapenemase-producing Klebsiella pneumoniae and Escherichia coli in European hospitals. METHODS: National expert laboratories recruited hospitals with diagnostic capacities, who collected the first ten carbapenem non-susceptible clinical isolates of K pneumoniae or E coli and ten susceptible same-species comparator isolates and pertinent patient and hospital information. Isolates and data were relayed back to national expert laboratories, which made laboratory-substantiated information available for central analysis. FINDINGS: Between Nov 1, 2013, and April 30, 2014, 455 sentinel hospitals in 36 countries submitted 2703 clinical isolates (2301 [85%] K pneumoniae and 402 [15%] E coli). 850 (37%) of 2301 K pneumoniae samples and 77 (19%) of 402 E coli samples were carbapenemase (KPC, NDM, OXA-48-like, or VIM) producers. The ratio of K pneumoniae to E coli was 11:1. 1·3 patients per 10 000 hospital admissions had positive clinical specimens. Prevalence differed greatly, with the highest rates in Mediterranean and Balkan countries. Carbapenemase-producing K pneumoniae isolates showed high resistance to last-line antibiotics. INTERPRETATION: This initiative shows an encouraging commitment by all participants, and suggests that challenges in the establishment of a continent-wide enhanced sentinel surveillance for carbapenemase-producing Enterobacteriaceae can be overcome. Strengthening infection control efforts in hospitals is crucial for controlling spread through local and national health care networks. FUNDING: European Centre for Disease Prevention and Control.

Argimón S, Aanensen DM. 2016. Species Mash-up. Nat Rev Microbiol, 14 (12), pp. 730.

Argimón S, Abudahab K, Goater RJE, Fedosejev A, Bhai J, Glasner C, Feil EJ, Holden MTG, Yeats CA, Grundmann H et al. 2016. Microreact: visualizing and sharing data for genomic epidemiology and phylogeography. Microb Genom, 2 (11), pp. e000093.

Visualization is frequently used to aid our interpretation of complex datasets. Within microbial genomics, visualizing the relationships between multiple genomes as a tree provides a framework onto which associated data (geographical, temporal, phenotypic and epidemiological) are added to generate hypotheses and to explore the dynamics of the system under investigation. Selected static images are then used within publications to highlight the key findings to a wider audience. However, these images are a very inadequate way of exploring and interpreting the richness of the data. There is, therefore, a need for flexible, interactive software that presents the population genomic outputs and associated data in a user-friendly manner for a wide range of end users, from trained bioinformaticians to front-line epidemiologists and health workers. Here, we present Microreact, a web application for the easy visualization of datasets consisting of any combination of trees, geographical, temporal and associated metadata. Data files can be uploaded to Microreact directly via the web browser or by linking to their location (e.g. from Google Drive/Dropbox or via API), and an integrated visualization via trees, maps, timelines and tables provides interactive querying of the data. The visualization can be shared as a permanent web link among collaborators, or embedded within publications to enable readers to explore and download the data. Microreact can act as an end point for any tool or bioinformatic pipeline that ultimately generates a tree, and provides a simple, yet powerful, visualization method that will aid research and discovery and the open sharing of datasets.

Didelot X, Dordel J, Whittles LK, Collins C, Bilek N, Bishop CJ, White PJ, Aanensen DM, Parkhill J, Bentley SD et al. 2016. Genomic Analysis and Comparison of Two Gonorrhea Outbreaks. MBio, 7 (3).

UNLABELLED: Gonorrhea is a sexually transmitted disease causing growing concern, with a substantial increase in reported incidence over the past few years in the United Kingdom and rising levels of resistance to a wide range of antibiotics. Understanding its epidemiology is therefore of major biomedical importance, not only on a population scale but also at the level of direct transmission. However, the molecular typing techniques traditionally used for gonorrhea infections do not provide sufficient resolution to investigate such fine-scale patterns. Here we sequenced the genomes of 237 isolates from two local collections of isolates from Sheffield and London, each of which was resolved into a single type using traditional methods. The two data sets were selected to have different epidemiological properties: the Sheffield data were collected over 6 years from a predominantly heterosexual population, whereas the London data were gathered within half a year and strongly associated with men who have sex with men. Based on contact tracing information between individuals in Sheffield, we found that transmission is associated with a median time to most recent common ancestor of 3.4 months, with an upper bound of 8 months, which we used as a criterion to identify likely transmission links in both data sets. In London, we found that transmission happened predominantly between individuals of similar age, sexual orientation, and location and also with the same HIV serostatus, which may reflect serosorting and associated risk behaviors. Comparison of the two data sets suggests that the London epidemic involved about ten times more cases than the Sheffield outbreak. IMPORTANCE: The recent increases in gonorrhea incidence and antibiotic resistance are cause for public health concern. Successful intervention requires a better understanding of transmission patterns, which is not uncovered by traditional molecular epidemiology techniques. 
Here we studied two outbreaks that took place in Sheffield and London, United Kingdom. We show that whole-genome sequencing provides the resolution to investigate direct gonorrhea transmission between infected individuals. Combining genome sequencing with rich epidemiological information about infected individuals reveals the importance of several transmission routes and risk factors, which can be used to design better control measures.

Aanensen DM, Feil EJ, Holden MTG, Dordel J, Yeats CA, Fedosejev A, Goater R, Castillo-Ramírez S, Corander J, Colijn C et al. 2016. Whole-Genome Sequencing for Routine Pathogen Surveillance in Public Health: a Population Snapshot of Invasive Staphylococcus aureus in Europe. MBio, 7 (3).

UNLABELLED: The implementation of routine whole-genome sequencing (WGS) promises to transform our ability to monitor the emergence and spread of bacterial pathogens. Here we combined WGS data from 308 invasive Staphylococcus aureus isolates corresponding to a pan-European population snapshot, with epidemiological and resistance data. Geospatial visualization of the data is made possible by a generic software tool designed for public health purposes that is available at the project URL (http://www.microreact.org/project/EkUvg9uY?tt=rc). Our analysis demonstrates that high-risk clones can be identified on the basis of population level properties such as clonal relatedness, abundance, and spatial structuring and by inferring virulence and resistance properties on the basis of gene content. We also show that in silico predictions of antibiotic resistance profiles are at least as reliable as phenotypic testing. We argue that this work provides a comprehensive road map illustrating the three vital components for future molecular epidemiological surveillance: (i) large-scale structured surveys, (ii) WGS, and (iii) community-oriented database infrastructure and analysis tools. IMPORTANCE: The spread of antibiotic-resistant bacteria is a public health emergency of global concern, threatening medical intervention at every level of health care delivery. Several recent studies have demonstrated the promise of routine whole-genome sequencing (WGS) of bacterial pathogens for epidemiological surveillance, outbreak detection, and infection control. However, as this technology becomes more widely adopted, the key challenges of generating representative national and international data sets and the development of bioinformatic tools to manage and interpret the data become increasingly pertinent. This study provides a road map for the integration of WGS data into routine pathogen surveillance. 
We emphasize the importance of large-scale routine surveys to provide the population context for more targeted or localized investigation and the development of open-access bioinformatic tools to provide the means to combine and compare independently generated data with publicly available data sets.

Crellen T, Allan F, David S, Durrant C, Huckvale T, Holroyd N, Emery AM, Rollinson D, Aanensen DM, Berriman M et al. 2016. Whole genome resequencing of the human parasite Schistosoma mansoni reveals population history and effects of selection. Sci Rep, 6 (1), pp. 20954.

Schistosoma mansoni is a parasitic fluke that infects millions of people in the developing world. This study presents the first application of population genomics to S. mansoni based on high-coverage resequencing data from 10 global isolates and an isolate of the closely-related Schistosoma rodhaini, which infects rodents. Using population genetic tests, we document genes under directional and balancing selection in S. mansoni that may facilitate adaptation to the human host. Coalescence modeling reveals the speciation of S. mansoni and S. rodhaini as 107.5-147.6 KYA, a period which overlaps with the earliest archaeological evidence for fishing in Africa. Our results indicate that S. mansoni originated in East Africa and experienced a decline in effective population size 20-90 KYA, before dispersing across the continent during the Holocene. In addition, we find strong evidence that S. mansoni migrated to the New World with the 16th-19th century Atlantic Slave Trade.

Reuter S, Török ME, Holden MTG, Reynolds R, Raven KE, Blane B, Donker T, Bentley SD, Aanensen DM, Grundmann H et al. 2016. Building a genomic framework for prospective MRSA surveillance in the United Kingdom and the Republic of Ireland. Genome Res, 26 (2), pp. 263-270.

The correct interpretation of microbial sequencing data applied to surveillance and outbreak investigation depends on accessible genomic databases to provide vital genetic context. Our aim was to construct and describe a United Kingdom MRSA database containing over 1000 methicillin-resistant Staphylococcus aureus (MRSA) genomes drawn from England, Northern Ireland, Wales, Scotland, and the Republic of Ireland over a decade. We sequenced 1013 MRSA submitted to the British Society for Antimicrobial Chemotherapy by 46 laboratories between 2001 and 2010. Each isolate was assigned to a regional healthcare referral network in England and was otherwise grouped based on country of origin. Phylogenetic reconstructions were used to contextualize MRSA outbreak investigations and to detect the spread of resistance. The majority of isolates (n = 783, 77%) belonged to CC22, which contains the dominant United Kingdom epidemic clone (EMRSA-15). There was marked geographic structuring of EMRSA-15, consistent with widespread dissemination prior to the sampling decade followed by local diversification. The addition of MRSA genomes from two outbreaks and one pseudo-outbreak demonstrated the certainty with which outbreaks could be confirmed or refuted. We identified local and regional differences in antibiotic resistance profiles, with examples of local expansion, as well as widespread circulation of mobile genetic elements across the bacterial population. We have generated a resource for the future surveillance and outbreak investigation of MRSA in the United Kingdom and Ireland and have shown the value of this during outbreak investigation and tracking of antimicrobial resistance.

Grundmann H, Schouls LM, Aanensen DM, Pluister GN, Tami A, Chlebowicz M, Glasner C, Sabat AJ, Weist K, Heuer O et al. 2014. The dynamic changes of dominant clones of Staphylococcus aureus causing bloodstream infections in the European region: results of a second structured survey. Euro Surveill, 19 (49), pp. 35-44.

Staphylococcus aureus is one of the most important human pathogens and meticillin-resistant S. aureus (MRSA) presents a major cause of healthcare- and community-acquired infections. This study investigated the spatial and temporal changes of S. aureus causing bacteraemia in Europe over a five-year interval and explored the possibility of integrating pathogen-based typing data with epidemiological and clinical information at a European level. Between January 2011 and July 2011, 350 laboratories serving 453 hospitals in 25 countries collected 3,753 isolates (meticillin-sensitive S. aureus (MSSA) and MRSA) from patients with S. aureus bloodstream infections. All isolates were sent to the national staphylococcal reference laboratories and characterised by quality-controlled spa typing. Data were uploaded to an interactive web-based mapping tool. A wide geographical distribution of spa types was found, with some prevalent in all European countries. MSSA was more diverse than MRSA. MRSA differed considerably between countries with major international clones expanding or receding when compared to a 2006 survey. We provide evidence that a network approach of decentralised typing and visualisation of aggregated data using an interactive mapping tool can provide important information on the dynamics of S. aureus populations such as early signalling of emerging strains, cross-border spread and importation by travel.

Aanensen DM, Huntley DM, Menegazzo M, Powell CI, Spratt BG. 2014. EpiCollect+: linking smartphones to web applications for complex data collection projects. F1000Res, 3 pp. 199.

Previously, we have described the development of the generic mobile phone data gathering tool, EpiCollect, and an associated web application, providing two-way communication between multiple data gatherers and a project database. This software only allows data collection on the phone using a single questionnaire form that is tailored to the needs of the user (including a single GPS point and photo per entry), whereas many applications require a more complex structure, allowing users to link a series of forms in a linear or branching hierarchy, along with the addition of any number of media types accessible from smartphones and/or tablet devices (e.g., GPS, photos, videos, sound clips and barcode scanning). A much enhanced version of EpiCollect has been developed (EpiCollect+). The individual data collection forms in EpiCollect+ provide more design complexity than the single form used in EpiCollect, and the software allows the generation of complex data collection projects through the ability to link many forms together in a linear (or branching) hierarchy. Furthermore, EpiCollect+ allows the collection of multiple media types as well as standard text fields, increased data validation and form logic. The entire process of setting up a complex mobile phone data collection project to the specification of a user (project and form definitions) can be undertaken at the EpiCollect+ website using a simple 'drag and drop' procedure, with visualisation of the data gathered using Google Maps and charts at the project website. EpiCollect+ is suitable for situations where multiple users transmit complex data by mobile phone (or other Android devices) to a single project web database and is already being used for a range of field projects, particularly public health projects in sub-Saharan Africa. However, many uses can be envisaged from education, ecology and epidemiology to citizen science.

Jombart T, Aanensen DM, Baguelin M, Birrell P, Cauchemez S, Camacho A, Colijn C, Collins C, Cori A, Didelot X et al. 2014. OutbreakTools: a new platform for disease outbreak analysis using the R software. Epidemics, 7 pp. 28-34.

The investigation of infectious disease outbreaks relies on the analysis of increasingly complex and diverse data, which offer new prospects for gaining insights into disease transmission processes and informing public health policies. However, the potential of such data can only be harnessed using a number of different, complementary approaches and tools, and a unified platform for the analysis of disease outbreaks is still lacking. In this paper, we present the new R package OutbreakTools, which aims to provide a basis for outbreak data management and analysis in R. OutbreakTools is developed by a community of epidemiologists, statisticians, modellers and bioinformaticians, and implements classes and methods for storing, handling and visualizing outbreak data. It includes real and simulated outbreak datasets. Together with a number of tools for infectious disease epidemiology recently made available in R, OutbreakTools contributes to the emergence of a new, free and open-source platform for the analysis of disease outbreaks.

Chewapreecha C, Harris SR, Croucher NJ, Turner C, Marttinen P, Cheng L, Pessia A, Aanensen DM, Mather AE, Page AJ et al. 2014. Dense genomic sampling identifies highways of pneumococcal recombination. Nat Genet, 46 (3), pp. 305-309.

Evasion of clinical interventions by Streptococcus pneumoniae occurs through selection of non-susceptible genomic variants. We report whole-genome sequencing of 3,085 pneumococcal carriage isolates from a 2.4-km² refugee camp. This sequencing provides unprecedented resolution of the process of recombination and its impact on population evolution. Genomic recombination hotspots show remarkable consistency between lineages, indicating common selective pressures acting at certain loci, particularly those associated with antibiotic resistance. Temporal changes in antibiotic consumption are reflected in changes in recombination trends, demonstrating rapid spread of resistance when selective pressure is high. The highest frequencies of receipt and donation of recombined DNA fragments were observed in non-encapsulated lineages, implying that this largely overlooked pneumococcal group, which is beyond the reach of current vaccines, may have a major role in genetic exchange and the adaptation of the species as a whole. These findings advance understanding of pneumococcal population dynamics and provide information for the design of future intervention strategies.

Limmathurotsakul D, Wongsuvan G, Aanensen D, Ngamwilai S, Saiprom N, Rongkard P, Thaipadungpanit J, Kanoksil M, Chantratita N, Day NPJ, Peacock SJ. 2014. Melioidosis caused by Burkholderia pseudomallei in drinking water, Thailand, 2012. Emerg Infect Dis, 20 (2), pp. 265-268.

We identified 10 patients in Thailand with culture-confirmed melioidosis who had Burkholderia pseudomallei isolated from their drinking water. The multilocus sequence type of B. pseudomallei from clinical specimens and water samples were identical for 2 patients. This finding suggests that drinking water is a preventable source of B. pseudomallei infection.

Zhang L, Thomas JC, Miragaia M, Bouchami O, Chaves F, d'Azevedo PA, Aanensen DM, de Lencastre H, Gray BM, Robinson DA. 2013. Multilocus sequence typing and further genetic characterization of the enigmatic pathogen, Staphylococcus hominis. PLoS One, 8 (6), pp. e66496.

Staphylococcus hominis is a commensal resident of human skin and an opportunistic pathogen. The species is subdivided into two subspecies, S. hominis subsp. hominis and S. hominis subsp. novobiosepticus, which are difficult to distinguish. To investigate the evolution and epidemiology of S. hominis, a total of 108 isolates collected from 10 countries over 40 years were characterized by classical phenotypic methods and genetic methods. One nonsynonymous mutation in gyrB, scored with a novel SNP typing assay, had a perfect association with the novobiocin-resistant phenotype. A multilocus sequence typing (MLST) scheme was developed from six housekeeping gene fragments, and revealed relatively high levels of genetic diversity and a significant impact of recombination on S. hominis population structure. Among the 40 sequence types (STs) identified by MLST, three STs (ST2, ST16 and ST23) were S. hominis subsp. novobiosepticus, and they distinguished between isolates from different outbreaks, whereas 37 other STs were S. hominis subsp. hominis, one of which was widely disseminated (ST1). A modified PCR assay was developed to detect the presence of ccrAB4 from the SCCmec genetic element. S. hominis subsp. novobiosepticus isolates were oxacillin-resistant and carriers of specific components of SCCmec (mecA class A, ccrAB3, ccrAB4, ccrC), whereas S. hominis subsp. hominis included both oxacillin-sensitive and -resistant isolates and a more diverse array of SCCmec components. Surprisingly, phylogenetic analyses indicated that S. hominis subsp. novobiosepticus may be a polyphyletic and, hence, artificial taxon. In summary, these results revealed the genetic diversity of S. hominis, the identities of outbreak-causing clones, and the evolutionary relationships between subspecies and clones. The pathogenic lifestyle attributed to S. hominis subsp. novobiosepticus may have originated on more than one occasion.

Cheng L, Connor TR, Sirén J, Aanensen DM, Corander J. 2013. Hierarchical and spatially explicit clustering of DNA sequences with BAPS software. Mol Biol Evol, 30 (5), pp. 1224-1228.

Phylogeographical analyses have become commonplace for a myriad of organisms with the advent of cheap DNA sequencing technologies. Bayesian model-based clustering is a powerful tool for detecting important patterns in such data and can be used to decipher even quite subtle signals of systematic differences in molecular variation. Here, we introduce two upgrades to the Bayesian Analysis of Population Structure (BAPS) software, which enable 1) spatially explicit modeling of variation in DNA sequences and 2) hierarchical clustering of DNA sequence data to reveal nested genetic population structures. We provide a direct interface to map the results from spatial clustering with Google Maps using the portal http://www.spatialepidemiology.net/ and illustrate this approach using sequence data from Borrelia burgdorferi. The usefulness of hierarchical clustering is demonstrated through an analysis of the metapopulation structure within a bacterial population experiencing a high level of local horizontal gene transfer. The tools that are introduced are freely available at http://www.helsinki.fi/bsg/software/BAPS/.

Holden MTG, Hsu L-Y, Kurt K, Weinert LA, Mather AE, Harris SR, Strommenger B, Layer F, Witte W, de Lencastre H et al. 2013. A genomic portrait of the emergence, evolution, and global spread of a methicillin-resistant Staphylococcus aureus pandemic. Genome Res, 23 (4), pp. 653-664.

The widespread use of antibiotics in association with high-density clinical care has driven the emergence of drug-resistant bacteria that are adapted to thrive in hospitalized patients. Of particular concern are globally disseminated methicillin-resistant Staphylococcus aureus (MRSA) clones that cause outbreaks and epidemics associated with health care. The most rapidly spreading and tenacious health-care-associated clone in Europe currently is EMRSA-15, which was first detected in the UK in the early 1990s and subsequently spread throughout Europe and beyond. Using phylogenomic methods to analyze the genome sequences for 193 S. aureus isolates, we were able to show that the current pandemic population of EMRSA-15 descends from a health-care-associated MRSA epidemic that spread throughout England in the 1980s, which had itself previously emerged from a primarily community-associated methicillin-sensitive population. The emergence of fluoroquinolone resistance in this EMRSA-15 subclone in the English Midlands during the mid-1980s appears to have played a key role in triggering pandemic spread, and occurred shortly after the first clinical trials of this drug. Genome-based coalescence analysis estimated that the population of this subclone over the last 20 yr has grown four times faster than its progenitor. Using comparative genomic analysis we identified the molecular genetic basis of 99.8% of the antimicrobial resistance phenotypes of the isolates, highlighting the potential of pathogen genome sequencing as a diagnostic tool. We document the genetic changes associated with adaptation to the hospital environment and with increasing drug resistance over time, and how MRSA evolution likely has been influenced by country-specific drug use regimens.

Theethakaew C, Feil EJ, Castillo-Ramírez S, Aanensen DM, Suthienkul O, Neil DM, Davies RL. 2013. Genetic relationships of Vibrio parahaemolyticus isolates from clinical, human carrier, and environmental sources in Thailand, determined by multilocus sequence analysis. Appl Environ Microbiol, 79 (7), pp. 2358-2370.

Vibrio parahaemolyticus is a seafood-borne pathogenic bacterium that is a major cause of gastroenteritis worldwide. We investigated the genetic and evolutionary relationships of 101 V. parahaemolyticus isolates originating from clinical, human carrier, and various environmental and seafood production sources in Thailand using multilocus sequence analysis. The isolates were recovered from clinical samples (n = 15), healthy human carriers (n = 18), various types of fresh seafood (n = 18), frozen shrimp (n = 16), fresh-farmed shrimp tissue (n = 18), and shrimp farm water (n = 16). Phylogenetic analysis revealed a high degree of genetic diversity within the V. parahaemolyticus population, although isolates recovered from clinical samples and from farmed shrimp and water samples represented distinct clusters. The tight clustering of the clinical isolates suggests that disease-causing isolates are not a random sample of the environmental reservoir, although the source of infection remains unclear. Extensive serotypic diversity occurred among isolates representing the same sequence types and recovered from the same source at the same time. These findings suggest that the O- and K-antigen-encoding loci are subject to exceptionally high rates of recombination. There was also strong evidence of interspecies horizontal gene transfer and intragenic recombination involving the recA locus in a large proportion of isolates. As the majority of the intragenic recombinational exchanges involving recA occurred among clinical and carrier isolates, it is possible that the human intestinal tract serves as a potential reservoir of donor and recipient strains that is promoting horizontal DNA transfer, driving evolutionary change, and leading to the emergence of new, potentially pathogenic strains.

Olson DH, Aanensen DM, Ronnenberg KL, Powell CI, Walker SF, Bielby J, Garner TWJ, Weaver G, Bd Mapping Group, Fisher MC. 2013. Mapping the global emergence of Batrachochytrium dendrobatidis, the amphibian chytrid fungus. PLoS One, 8 (2), pp. e56802.

The rapid worldwide emergence of the amphibian pathogen Batrachochytrium dendrobatidis (Bd) is having a profound negative impact on biodiversity. However, global research efforts are fragmented and an overarching synthesis of global infection data is lacking. Here, we provide results from a community tool for the compilation of worldwide Bd presence and report on the analyses of data collated over a four-year period. Using this online database, we analysed: 1) spatial and taxonomic patterns of infection, including amphibian families that appear over- and under-infected; 2) relationships between Bd occurrence and declining amphibian species, including associations among Bd occurrence, species richness, and enigmatic population declines; and 3) patterns of environmental correlates with Bd, including climate metrics for all species combined and three families (Hylidae, Bufonidae, Ranidae) separately, at both a global scale and regional (U.S.A.) scale. These associations provide new insights for downscaled hypothesis testing. The pathogen has been detected in 52 of 82 countries in which sampling was reported, and it has been detected in 516 of 1240 (42%) amphibian species. We show that detected Bd infections are related to amphibian biodiversity and locations experiencing rapid enigmatic declines, supporting the hypothesis that greater complexity of amphibian communities increases the likelihood of emergence of infection and transmission of Bd. Using a global model including all sampled species, the odds of Bd detection decreased with increasing temperature range at a site. Further consideration of temperature range, rather than maximum or minimum temperatures, may provide new insights into Bd-host ecology. Whereas caution is necessary when interpreting such a broad global dataset, the use of our pathogen database is helping to inform studies of the epidemiology of Bd, as well as enabling regional, national, and international prioritization of conservation efforts. 
We provide recommendations for adaptive management to enhance the database utility and relevance.

Boonsilp S, Thaipadungpanit J, Amornchai P, Wuthiekanun V, Bailey MS, Holden MTG, Zhang C, Jiang X, Koizumi N, Taylor K et al. 2013. A single multilocus sequence typing (MLST) scheme for seven pathogenic Leptospira species. PLoS Negl Trop Dis, 7 (1), pp. e1954.

BACKGROUND: The available Leptospira multilocus sequence typing (MLST) scheme supported by a MLST website is limited to L. interrogans and L. kirschneri. Our aim was to broaden the utility of this scheme to incorporate a total of seven pathogenic species. METHODOLOGY AND FINDINGS: We modified the existing scheme by replacing one of the seven MLST loci (fadD was changed to caiB), as the former gene did not appear to be present in some pathogenic species. Comparison of the original and modified schemes using data for L. interrogans and L. kirschneri demonstrated that the discriminatory power of the two schemes was not significantly different. The modified scheme was used to further characterize 325 isolates (L. alexanderi [n = 5], L. borgpetersenii [n = 34], L. interrogans [n = 222], L. kirschneri [n = 29], L. noguchii [n = 9], L. santarosai [n = 10], and L. weilii [n = 16]). Phylogenetic analysis using concatenated sequences of the 7 loci demonstrated that each species corresponded to a discrete clade, and that no strains were misclassified at the species level. Comparison between genotype and serovar was possible for 254 isolates. Of the 31 sequence types (STs) represented by at least two isolates, 18 STs included isolates assigned to two or three different serovars. Conversely, 14 serovars were identified that contained between 2 to 10 different STs. New observations were made on the global phylogeography of Leptospira spp., and the utility of MLST in making associations between human disease and specific maintenance hosts was demonstrated. CONCLUSION: The new MLST scheme, supported by an updated MLST website, allows the characterization and species assignment of isolates of the seven major pathogenic species associated with leptospirosis.

Cornelius DC, Robinson DA, Muzny CA, Mena LA, Aanensen DM, Lushbaugh WB, Meade JC. 2012. Genetic characterization of Trichomonas vaginalis isolates by use of multilocus sequence typing. J Clin Microbiol, 50 (10), pp. 3293-3300.

In this study, we introduce a multilocus sequence typing (MLST) scheme, comprised of seven single-copy housekeeping genes, to genetically characterize Trichomonas vaginalis. Sixty-eight historical and recent isolates of T. vaginalis were sampled from the American Type Culture Collection and female patients at area health care facilities, respectively, to assess the usefulness of this typing method. Forty-three polymorphic nucleotide sites, 51 different alleles, and 60 sequence types were distinguished among the 68 isolates, revealing a diverse T. vaginalis population. Moreover, this discriminatory MLST scheme retains the ability to identify epidemiologically linked isolates such as those collected from sexual partners. Population genetic and phylogenetic analyses determined that T. vaginalis population structure is strongly influenced by recombination and is composed of two separate populations that may be nonclonal. MLST is useful for investigating the epidemiology, genetic diversity, and population structure of T. vaginalis.
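
For readers unfamiliar with MLST, the step from locus sequences to alleles and sequence types can be sketched in a few lines of Python. This is an illustration only and not the authors' pipeline: the function name, locus names and sequences are hypothetical, and numbering alleles and sequence types in order of first appearance is an assumption (curated MLST databases assign these numbers centrally).

```python
def assign_sequence_types(isolates, loci):
    """Map each isolate to a sequence type (ST) from its per-locus sequences.

    Distinct sequences at a locus get allele numbers 1, 2, ... in order of
    first appearance; distinct allelic profiles get ST numbers the same way.
    """
    allele_ids = {locus: {} for locus in loci}  # locus -> {sequence: allele no.}
    profile_ids = {}                            # allelic profile -> ST no.
    sequence_types = {}
    for name, sequences in isolates.items():
        profile = tuple(
            allele_ids[locus].setdefault(sequences[locus], len(allele_ids[locus]) + 1)
            for locus in loci
        )
        sequence_types[name] = profile_ids.setdefault(profile, len(profile_ids) + 1)
    return sequence_types


# Hypothetical toy data: two loci instead of seven, very short sequences.
loci = ["locusA", "locusB"]
isolates = {
    "iso1": {"locusA": "ACGT", "locusB": "GGCC"},
    "iso2": {"locusA": "ACGT", "locusB": "GGCC"},  # same profile as iso1
    "iso3": {"locusA": "ACGA", "locusB": "GGCC"},  # new allele at locusA
}
sequence_types = assign_sequence_types(isolates, loci)
print(sequence_types)
```

Isolates that share an allele at every locus receive the same sequence type, which is why epidemiologically linked isolates can be grouped while a diverse collection resolves into many types.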

McAdam PR, Templeton KE, Edwards GF, Holden MTG, Feil EJ, Aanensen DM, Bargawi HJA, Spratt BG, Bentley SD, Parkhill J et al. 2012. Molecular tracing of the emergence, adaptation, and transmission of hospital-associated methicillin-resistant Staphylococcus aureus. Proc Natl Acad Sci U S A, 109 (23), pp. 9107-9112.

Hospital-associated infections caused by methicillin-resistant Staphylococcus aureus (MRSA) are a global health burden dominated by a small number of bacterial clones. The pandemic EMRSA-16 clone (ST36-II) has been widespread in UK hospitals for 20 y, but its evolutionary origin and the molecular basis for its hospital association are unclear. We carried out a Bayesian phylogenetic reconstruction on the basis of the genome sequences of 87 S. aureus isolates including 60 EMRSA-16 and 27 additional clonal complex 30 (CC30) isolates, collected from patients in three continents over a 53-y period. The three major pandemic clones to originate from the CC30 lineage, including phage type 80/81, Southwest Pacific, and EMRSA-16, shared a most recent common ancestor that existed over 100 y ago, whereas the hospital-associated EMRSA-16 clone is estimated to have emerged about 35 y ago. Our CC30 genome-wide analysis revealed striking molecular correlates of hospital- or community-associated pandemics represented by mobile genetic elements and nonsynonymous mutations affecting antibiotic resistance and virulence. Importantly, phylogeographic analysis indicates that EMRSA-16 spread within the United Kingdom by transmission from hospitals in large population centers in London and Glasgow to regional health-care settings, implicating patient referrals as an important cause of nationwide transmission. Taken together, the high-resolution phylogenomic approach used resulted in a unique understanding of the emergence and transmission of a major MRSA clone and provided molecular correlates of its hospital adaptation. Similar approaches for hospital-associated clones of other bacterial pathogens may inform appropriate measures for controlling their intra- and interhospital spread.

Ahmed A, Thaipadungpanit J, Boonsilp S, Wuthiekanun V, Nalam K, Spratt BG, Aanensen DM, Smythe LD, Ahmed N, Feil EJ et al. 2011. Comparison of two multilocus sequence based genotyping schemes for Leptospira species. PLoS Negl Trop Dis, 5 (11), pp. e1374.

BACKGROUND: Several sequence based genotyping schemes have been developed for Leptospira spp. The objective of this study was to genotype a collection of clinical and reference isolates using the two most commonly used schemes and compare and contrast the results. METHODS AND FINDINGS: A total of 48 isolates consisting of L. interrogans (n = 40) and L. kirschneri (n = 8) were typed by the 7 locus MLST scheme described by Thaipadungpanit et al., and the 6 locus genotyping scheme described by Ahmed et al., (termed 7L and 6L, respectively). Two L. interrogans isolates were not typed using 6L because of a deletion of three nucleotides in lipL32. The remaining 46 isolates were resolved into 21 sequence types (STs) by 7L, and 30 genotypes by 6L. Overall nucleotide diversity (based on concatenated sequence) was 3.6% and 2.3% for 7L and 6L, respectively. The D value (discriminatory ability) of 7L and 6L were comparable, i.e. 92.0 (95% CI 87.5-96.5) vs. 93.5 (95% CI 88.6-98.4). The dN/dS ratios calculated for each locus indicated that none were under positive selection. Neighbor joining trees were reconstructed based on the concatenated sequences for each scheme. Both trees showed two distinct groups corresponding to L. interrogans and L. kirschneri, and both identified two clones containing 10 and 7 clinical isolates, respectively. There were six instances in which 6L split single STs as defined by 7L into closely related clusters. We noted two discrepancies between the trees in which the genetic relatedness between two pairs of strains were more closely related by 7L than by 6L. CONCLUSIONS: This genetic analysis indicates that the two schemes are comparable. We discuss their practical advantages and disadvantages.
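The abstract reports a D value for each scheme but does not give the formula. A minimal sketch, assuming D is the Simpson's index of diversity commonly used for typing schemes (the Hunter-Gaston discriminatory index); the function name and the toy labels are hypothetical and are not data from the study:

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form).

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number
    of isolates assigned to type j and N is the total number of isolates.
    D is the probability that two isolates drawn at random without
    replacement are assigned to different types.
    """
    counts = Counter(type_labels)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical toy data: six isolates resolved into three types.
toy_types = ["ST1", "ST1", "ST2", "ST3", "ST3", "ST3"]
toy_d = discriminatory_index(toy_types)
```

Multiplying the result by 100 would give a percentage on the same scale as the values quoted in the abstract. The confidence intervals quoted there need a separate variance estimate that this sketch does not cover.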

Cheng L, Connor TR, Aanensen DM, Spratt BG, Corander J. 2011. Bayesian semi-supervised classification of bacterial samples using MLST databases. BMC Bioinformatics, 12 (1), pp. 302.

BACKGROUND: Worldwide effort on sampling and characterization of molecular variation within a large number of human and animal pathogens has led to the emergence of multi-locus sequence typing (MLST) databases as an important tool for studying the epidemiology and evolution of pathogens. Many of these databases are currently harboring several thousands of multi-locus DNA sequence types (STs) enriched with metadata over traits such as serotype, antibiotic resistance, host organism etc. of the isolates. Curators of the databases have thus the possibility of dividing the pathogen populations into subsets representing different evolutionary lineages, geographically associated groups, or other subpopulations, which are defined in terms of molecular similarities and dissimilarities residing within a database. When combined with the existing metadata, such subsets may provide invaluable information for assessing the position of a new set of isolates in relation to the whole pathogen population. RESULTS: To enable users of MLST schemes to query the databases with sets of new bacterial isolates and to automatically analyze their relation to existing curated sequences, we introduce here a Bayesian model-based method for semi-supervised classification of MLST data. Our method can use an MLST database as a training set and assign simultaneously any set of query sequences into the earlier discovered lineages/populations, while also allowing some or all of these sequences to form previously undiscovered genetically distinct groups. This tool provides probabilistic quantification of the classification uncertainty and is highly efficient computationally, thus enabling rapid analyses of large databases and sets of query sequences. The latter feature is a necessary prerequisite for an automated access through the MLST web interface. We demonstrate the versatility of our approach by analyzing both real and synthesized data from MLST databases. The introduced method for semi-supervised classification of sets of query STs is freely available for Windows, Mac OS X and Linux operating systems in BAPS 5.4 software which is downloadable at http://web.abo.fi/fak/mnf/mate/jc/software/baps.html. The query functionality is also directly available for the Staphylococcus aureus database at http://www.mlst.net and shortly will be available for other species databases hosted at this web portal. CONCLUSIONS: We have introduced a model-based tool for automated semi-supervised classification of new pathogen samples that can be integrated into the web interface of the MLST databases. In particular, when combined with the existing metadata, the semi-supervised labeling may provide invaluable information for assessing the position of a new set of query strains in relation to the particular pathogen population represented by the curated database. Such information will be useful both for clinical and basic research purposes.

Chaloner GL, Harrison TG, Coyne KP, Aanensen DM, Birtles RJ. 2011. Multilocus sequence typing of Bartonella henselae in the United Kingdom indicates that only a few, uncommon sequence types are associated with zoonotic disease. J Clin Microbiol, 49 (6), pp. 2132-2137.

Bartonella henselae is one of the most common zoonotic agents acquired from companion animals (cats) in industrialized countries. Nonetheless, although the prevalence of infections in cats is high, the number of human cases reported is relatively low. One hypothesis for this discrepancy is that B. henselae strains vary in their zoonotic potential. To test this hypothesis, we employed structured sampling to explore the population structure of B. henselae in the United Kingdom and to determine the distribution of strains associated with zoonotic disease within this structure. A total of 118 B. henselae strains were delineated into 12 sequence types (STs) using multilocus sequence typing. We observed that most (85%) of the zoonosis-associated strains belonged to only three genotypes, i.e., ST2, ST5, and ST8. Conversely, most (74%) of the feline isolates belonged to ST4, ST6, and ST7. The difference in host association of ST2, ST5, and ST8 (zoonosis associated) and ST6 (feline) was statistically significant (P < 0.05), indicating that a few, uncommon STs were responsible for the majority of symptomatic human infections.

Ogden NH, Margos G, Aanensen DM, Drebot MA, Feil EJ, Hanincová K, Schwartz I, Tyler S, Lindsay LR. 2011. Investigation of genotypes of Borrelia burgdorferi in Ixodes scapularis ticks collected during surveillance in Canada. Appl Environ Microbiol, 77 (10), pp. 3244-3254.

The genetic diversity of Borrelia burgdorferi sensu stricto, the agent of Lyme disease in North America, has consequences for the performance of serological diagnostic tests and disease severity. To investigate B. burgdorferi diversity in Canada, where Lyme disease is emerging, bacterial DNA in 309 infected adult Ixodes scapularis ticks collected in surveillance was characterized by multilocus sequence typing (MLST) and analysis of outer surface protein C gene (ospC) alleles. Six ticks carried Borrelia miyamotoi, and one tick carried the novel species Borrelia kurtenbachii. 142 ticks carried B. burgdorferi sequence types (STs) previously described from the United States. Fifty-eight ticks carried B. burgdorferi of 1 of 19 novel or undescribed STs, which were single-, double-, or triple-locus variants of STs first described in the United States. Clonal complexes with founder STs from the United States were identified. Seventeen ospC alleles were identified in 309 B. burgdorferi-infected ticks. Positive and negative associations in the occurrence of different alleles in the same tick supported a hypothesis of multiple-niche polymorphism for B. burgdorferi in North America. Geographic analysis of STs and ospC alleles were consistent with south-to-north dispersion of infected ticks from U.S. sources on migratory birds. These observations suggest that the genetic diversity of B. burgdorferi in eastern and central Canada corresponds to that in the United States, but there was evidence for founder events skewing the diversity in emerging tick populations. Further studies are needed to investigate the significance of these observations for the performance of diagnostic tests and clinical presentation of Lyme disease in Canada.

Simwami SP, Khayhan K, Henk DA, Aanensen DM, Boekhout T, Hagen F, Brouwer AE, Harrison TS, Donnelly CA, Fisher MC. 2011. Low diversity Cryptococcus neoformans variety grubii multilocus sequence types from Thailand are consistent with an ancestral African origin. PLoS Pathog, 7 (4), pp. e1001343.

The global burden of HIV-associated cryptococcal meningitis is estimated at nearly one million cases per year, causing up to a third of all AIDS-related deaths. Molecular epidemiology constitutes the main methodology for understanding the factors underpinning the emergence of this understudied, yet increasingly important, group of pathogenic fungi. Cryptococcus species are notable in the degree that virulence differs amongst lineages, and highly-virulent emerging lineages are changing patterns of human disease both temporally and spatially. Cryptococcus neoformans variety grubii (Cng, serotype A) constitutes the most ubiquitous cause of cryptococcal meningitis worldwide, however patterns of molecular diversity are understudied across some regions experiencing significant burdens of disease. We compared 183 clinical and environmental isolates of Cng from one such region, Thailand, Southeast Asia, against a global MLST database of 77 Cng isolates. Population genetic analyses showed that Thailand isolates from 11 provinces were highly homogenous, consisting of the same genetic background (globally known as VNI) and exhibiting only ten nearly identical sequence types (STs), with three (STs 44, 45 and 46) dominating our sample. This population contains significantly less diversity when compared against the global population of Cng, specifically Africa. Genetic diversity in Cng was significantly subdivided at the continental level with nearly half (47%) of the global STs unique to a genetically diverse and recombining population in Botswana. These patterns of diversity, when combined with evidence from haplotypic networks and coalescent analyses of global populations, are highly suggestive of an expansion of the Cng VNI clade out of Africa, leading to a limited number of genotypes founding the Asian populations. Divergence time testing estimates the time to the most common ancestor between the African and Asian populations to be 6,920 years ago (95% HPD 122.96 - 27,177.76). Further high-density sampling of global Cng STs is now necessary to resolve the temporal sequence underlying the global emergence of this human pathogen.

Vollmer SA, Bormane A, Dinnis RE, Seelig F, Dobson ADM, Aanensen DM, James MC, Donaghy M, Randolph SE, Feil EJ et al. 2011. Host migration impacts on the phylogeography of Lyme Borreliosis spirochaete species in Europe. Environ Microbiol, 13 (1), pp. 184-192.

The geographic patterns of transmission opportunities of vector-borne zoonoses are determined by a complex interplay between the migration patterns of the host and the vector. Here we examine the impact of host migration on the spread of a tick-borne zoonotic disease, using Lyme Borreliosis (LB) spirochaetal species in Europe. We demonstrate that the migration of the LB species is dependent on and limited by the migration of their respective hosts. We note that populations of Borrelia spp. associated with birds (Borrelia garinii and B. valaisiana) show limited geographic structuring between countries compared with those associated with small mammals (Borrelia afzelii), and we argue that this can be explained by higher rates of migration in avian hosts. We also show the presence of B. afzelii strains in England and, through the use of the multi-locus sequence analysis scheme, reveal that the strains are highly structured. This pattern in English sites is very different from that observed at the continental sites, and we propose that these may be recent introductions.

Margos G, Hojgaard A, Lane RS, Cornet M, Fingerle V, Rudenko N, Ogden N, Aanensen DM, Fish D, Piesman J. 2010. Multilocus sequence analysis of Borrelia bissettii strains from North America reveals a new Borrelia species, Borrelia kurtenbachii. Ticks Tick Borne Dis, 1 (4), pp. 151-158.

Using multilocus sequence analyses (MLSA), we investigated the phylogenetic relationship of spirochaete strains from North America previously assigned to the genospecies Borrelia bissettii. We amplified internal fragments of 8 housekeeping genes (clpA, clpX, nifS, pepX, pyrG, recG, rplB, and uvrA) located on the main linear chromosome by polymerase chain reaction. Phylogenetic analysis of concatenated sequences of the 8 loci showed that the B. bissettii clade consisted of 4 closely related clusters which included strains from California (including the type strain DN127-Cl9-2/p7) and Colorado that were isolated from Ixodes pacificus, I. spinipalpis, or infected reservoir hosts. Several strains isolated from I. scapularis clustered distantly from B. bissettii. Genetic distance analyses confirmed that these strains are more distant to B. bissettii than they are to B. carolinensis, a recently described Borrelia species, which suggests that they constitute a new Borrelia genospecies. We propose that it be named Borrelia kurtenbachii sp. nov. in honour of the late Klaus Kurtenbach. The data suggest that ecological differences between B. bissettii and the new Borrelia genospecies reflect different transmission cycles. In view of these findings, the distinct vertebrate host-tick vector associations and the distributions of B. bissettii and B. kurtenbachii require further investigation.
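The concatenation step described above can be sketched as follows. This is an illustrative sketch only: the locus names come from the abstract, but the helper functions, the 4-base toy fragments, and the choice of an uncorrected p-distance are assumptions, since the abstract does not say which distance measure was used.

```python
# Locus order taken from the abstract; a fixed order keeps every
# concatenated sequence comparable site by site.
LOCI = ["clpA", "clpX", "nifS", "pepX", "pyrG", "recG", "rplB", "uvrA"]


def concatenate(fragments, loci=LOCI):
    """Join per-locus fragments (dict of locus -> sequence) in locus order."""
    return "".join(fragments[locus] for locus in loci)


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ between two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


# Hypothetical toy fragments, far shorter than real MLSA fragments.
strain_x = {locus: "ACGT" for locus in LOCI}
strain_y = dict(strain_x, clpA="ACGA", uvrA="TCGT")

dist = p_distance(concatenate(strain_x), concatenate(strain_y))
```

A matrix of such pairwise distances is the usual input to distance-based tree building. Real analyses typically apply a substitution model rather than the raw proportion of differences used here.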

Grundmann H, Aanensen DM, van den Wijngaard CC, Spratt BG, Harmsen D, Friedrich AW, European Staphylococcal Reference Laboratory Working Group. 2010. Geographic distribution of Staphylococcus aureus causing invasive infections in Europe: a molecular-epidemiological analysis. PLoS Med, 7 (1), pp. e1000215.

BACKGROUND: Staphylococcus aureus is one of the most important human pathogens and methicillin-resistant variants (MRSAs) are a major cause of hospital and community-acquired infection. We aimed to map the geographic distribution of the dominant clones that cause invasive infections in Europe. METHODS AND FINDINGS: In each country, staphylococcal reference laboratories secured the participation of a sufficient number of hospital laboratories to achieve national geo-demographic representation. Participating laboratories collected successive methicillin-susceptible (MSSA) and MRSA isolates from patients with invasive S. aureus infection using an agreed protocol. All isolates were sent to the respective national reference laboratories and characterised by quality-controlled sequence typing of the variable region of the staphylococcal spa gene (spa typing), and data were uploaded to a central database. Relevant genetic and phenotypic information was assembled for interactive interrogation by a purpose-built Web-based mapping application. Between September 2006 and February 2007, 357 laboratories serving 450 hospitals in 26 countries collected 2,890 MSSA and MRSA isolates from patients with invasive S. aureus infection. A wide geographical distribution of spa types was found with some prevalent in all European countries. MSSA were more diverse than MRSA. Genetic diversity of MRSA differed considerably between countries with dominant MRSA spa types forming distinctive geographical clusters. We provide evidence that a network approach consisting of decentralised typing and visualisation of aggregated data using an interactive mapping tool can provide important information on the dynamics of MRSA populations such as early signalling of emerging strains, cross border spread, and importation by travel. CONCLUSIONS: In contrast to MSSA, MRSA spa types have a predominantly regional distribution in Europe. This finding is indicative of the selection and spread of a limited number of clones within health care networks, suggesting that control efforts aimed at interrupting the spread within and between health care institutions may not only be feasible but ultimately successful and should therefore be strongly encouraged.

Meyer W, Aanensen DM, Boekhout T, Cogliati M, Diaz MR, Esposto MC, Fisher M, Gilgado F, Hagen F, Kaocharoen S et al. 2009. Consensus multi-locus sequence typing scheme for Cryptococcus neoformans and Cryptococcus gattii. Med Mycol, 47 (6), pp. 561-570.

This communication describes the consensus multi-locus typing scheme established by the Cryptococcal Working Group I (Genotyping of Cryptococcus neoformans and C. gattii) of the International Society for Human and Animal Mycology (ISHAM) using seven unlinked genetic loci for global strain genotyping. These genetic loci include the housekeeping genes CAP59, GPD1, LAC1, PLB1, SOD1, URA5 and the IGS1 region. Allele and sequence type information are accessible at http://www.mlst.net/.

Aanensen DM, Huntley DM, Feil EJ, al-Own F, Spratt BG. 2009. EpiCollect: linking smartphones to web applications for epidemiology, ecology and community data collection. PLoS One, 4 (9), pp. e6968.

BACKGROUND: Epidemiologists and ecologists often collect data in the field and, on returning to their laboratory, enter their data into a database for further analysis. The recent introduction of mobile phones that utilise the open source Android operating system, and which include (among other features) both GPS and Google Maps, provide new opportunities for developing mobile phone applications, which in conjunction with web applications, allow two-way communication between field workers and their project databases. METHODOLOGY: Here we describe a generic framework, consisting of mobile phone software, EpiCollect, and a web application located within www.spatialepidemiology.net. Data collected by multiple field workers can be submitted by phone, together with GPS data, to a common web database and can be displayed and analysed, along with previously collected data, using Google Maps (or Google Earth). Similarly, data from the web database can be requested and displayed on the mobile phone, again using Google Maps. Data filtering options allow the display of data submitted by the individual field workers or, for example, those data within certain values of a measured variable or a time period. CONCLUSIONS: Data collection frameworks utilising mobile phones with data submission to and from central databases are widely applicable and can give a field worker similar display and analysis tools on their mobile phone that they would have if viewing the data in their laboratory via the web. We demonstrate their utility for epidemiological data collection and display, and briefly discuss their application in ecological and community data collection. Furthermore, such frameworks offer great potential for recruiting 'citizen scientists' to contribute data easily to central databases through their mobile phone.

Hanage WP, Aanensen DM. 2009. Methods for data analysis. Methods Mol Biol, 551 pp. 287-304.

The molecular epidemiology of infectious diseases uses a variety of techniques to assay the relatedness of disease-causing organisms to identify strains responsible for outbreaks or associated with particular phenotypes of interest (such as antibiotic resistance) and, it is hoped, provide insights into where and how these strains have emerged. The correct analysis of such data requires that we understand how the assayed variation accumulates. We discuss this with specific reference to three classes of methods: those based on gel electrophoresis of fragments generated by restriction enzymes or polymerase chain reaction (PCR), those based on microsatellites and other repeat elements, and raw sequence data from protein-coding genes. We also provide a simple example of how the likely origin of an apparently novel antibiotic-resistant strain may be identified and conclude with a discussion of some popular analysis packages and the more interesting prospects for the future in this rapidly developing field.

Holden MTG, Heather Z, Paillot R, Steward KF, Webb K, Ainslie F, Jourdan T, Bason NC, Holroyd NE, Mungall K et al. 2009. Genomic evidence for the evolution of Streptococcus equi: host restriction, increased virulence, and genetic exchange with human pathogens. PLoS Pathog, 5 (3), pp. e1000346.

The continued evolution of bacterial pathogens has major implications for both human and animal disease, but the exchange of genetic material between host-restricted pathogens is rarely considered. Streptococcus equi subspecies equi (S. equi) is a host-restricted pathogen of horses that has evolved from the zoonotic pathogen Streptococcus equi subspecies zooepidemicus (S. zooepidemicus). These pathogens share approximately 80% genome sequence identity with the important human pathogen Streptococcus pyogenes. We sequenced and compared the genomes of S. equi 4047 and S. zooepidemicus H70 and screened S. equi and S. zooepidemicus strains from around the world to uncover evidence of the genetic events that have shaped the evolution of the S. equi genome and led to its emergence as a host-restricted pathogen. Our analysis provides evidence of functional loss due to mutation and deletion, coupled with pathogenic specialization through the acquisition of bacteriophage encoding a phospholipase A(2) toxin, and four superantigens, and an integrative conjugative element carrying a novel iron acquisition system with similarity to the high pathogenicity island of Yersinia pestis. We also highlight that S. equi, S. zooepidemicus, and S. pyogenes share a common phage pool that enhances cross-species pathogen evolution. We conclude that the complex interplay of functional loss, pathogenic specialization, and genetic exchange between S. equi, S. zooepidemicus, and S. pyogenes continues to influence the evolution of these important streptococci.

Bishop CJ, Aanensen DM, Jordan GE, Kilian M, Hanage WP, Spratt BG. 2009. Assigning strains to bacterial species via the internet. BMC Biol, 7 (1), pp. 3.

BACKGROUND: Methods for assigning strains to bacterial species are cumbersome and no longer fit for purpose. The concatenated sequences of multiple house-keeping genes have been shown to be able to define and circumscribe bacterial species as sequence clusters. The advantage of this approach (multilocus sequence analysis; MLSA) is that, for any group of related species, a strain database can be produced and combined with software that allows query strains to be assigned to species via the internet. As an exemplar of this approach, we have studied a group of species, the viridans streptococci, which are very difficult to assign to species using standard taxonomic procedures, and have developed a website that allows species assignment via the internet. RESULTS: Seven house-keeping gene sequences were obtained from 420 streptococcal strains to produce a viridans group database. The reference tree produced using the concatenated sequences identified sequence clusters which, by examining the position on the tree of the type strain of each viridans group species, could be equated with species clusters. MLSA also identified clusters that may correspond to new species, and previously described species whose status needs to be re-examined. A generic website and software for electronic taxonomy was developed. This site http://www.eMLSA.net allows the sequences of the seven gene fragments of a query strain to be entered and for the species assignment to be returned, according to its position within an assigned species cluster on the reference tree. CONCLUSION: The MLSA approach resulted in the identification of well-resolved species clusters within this taxonomically challenging group and, using the software we have developed, allows unknown strains to be assigned to viridans species via the internet. Submission of new strains will provide a growing resource for the taxonomy of viridans group streptococci, allowing the recognition of potential new species and taxonomic anomalies. More generally, as the software at the MLSA website is generic, MLSA schemes and strain databases for other groups of related species can be hosted at this website, providing a portal for microbial electronic taxonomy.

Margos G, Gatewood AG, Aanensen DM, Hanincová K, Terekhova D, Vollmer SA, Cornet M, Piesman J, Donaghy M, Bormane A et al. 2008. MLST of housekeeping genes captures geographic population structure and suggests a European origin of Borrelia burgdorferi. Proc Natl Acad Sci U S A, 105 (25), pp. 8730-8735.

Lyme borreliosis, caused by the tick-borne bacterium Borrelia burgdorferi, has become the most common vector-borne disease in North America over the last three decades. To understand the dynamics of the epizootic spread and to predict the evolutionary trajectories of B. burgdorferi, accurate information on the population structure and the evolutionary relationships of the pathogen is crucial. We, therefore, developed a multilocus sequence typing (MLST) scheme for B. burgdorferi based on eight chromosomal housekeeping genes. We validated the MLST scheme on B. burgdorferi specimens from North America and Europe, comprising both cultured isolates and infected ticks. These data were compared with sequences for the commonly used genetic markers rrs-rrlA intergenic spacer (IGS) and the gene encoding the outer surface protein C (ospC). The study demonstrates that the concatenated sequences of the housekeeping genes of B. burgdorferi provide highly resolved phylogenetic signals and that the housekeeping genes evolve differently compared with the IGS locus and ospC. Using sequence data, the study reveals that North American and European populations of B. burgdorferi correspond to genetically distinct populations. Importantly, the MLST data suggest that B. burgdorferi originated in Europe rather than in North America as proposed previously.

Mavroidi A, Aanensen DM, Godoy D, Skovsted IC, Kaltoft MS, Reeves PR, Bentley SD, Spratt BG. 2007. Genetic relatedness of the Streptococcus pneumoniae capsular biosynthetic loci. J Bacteriol, 189 (21), pp. 7841-7855.

Streptococcus pneumoniae (the pneumococcus) produces 1 of 91 capsular polysaccharides (CPS) that define the serotype. The cps loci of 88 pneumococcal serotypes whose CPS is synthesized by the Wzy-dependent pathway were compared with each other and with additional streptococcal polysaccharide biosynthetic loci and were clustered according to the proportion of shared homology groups (HGs), weighted for the sequence similarities between the genes encoding the shared HGs. The cps loci of the 88 pneumococcal serotypes were distributed into eight major clusters and 21 subclusters. All serotypes within the same serogroup fell into the same major cluster, but in six cases, serotypes within the same serogroup were in different subclusters and, conversely, nine subclusters included completely different serotypes. The closely related cps loci within a subcluster were compared to the known CPS structures to relate gene content to structure. The Streptococcus oralis and Streptococcus mitis polysaccharide biosynthetic loci clustered within the pneumococcal cps loci and were in a subcluster that also included the cps locus of pneumococcal serotype 21, whereas the Streptococcus agalactiae cps loci formed a single cluster that was not closely related to any of the pneumococcal cps clusters.
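To illustrate what a "proportion of shared homology groups" might look like in practice, here is a deliberately simplified sketch. It uses an unweighted Jaccard proportion, which is an assumption: the study additionally weighted shared groups by sequence similarity, and its exact definition is not given in the abstract. The function name and the HG labels are hypothetical.

```python
def shared_hg_proportion(hgs_a, hgs_b):
    """Unweighted proportion of homology groups shared by two loci.

    Computed as |A intersect B| / |A union B| (Jaccard index). This is a
    simplification and omits the sequence-similarity weighting mentioned
    in the abstract.
    """
    a, b = set(hgs_a), set(hgs_b)
    union = a | b
    if not union:
        raise ValueError("at least one locus must contain a homology group")
    return len(a & b) / len(union)


# Hypothetical homology-group content for two cps loci.
locus_1 = {"HG1", "HG2", "HG3", "HG4"}
locus_2 = {"HG1", "HG2", "HG5"}

similarity = shared_hg_proportion(locus_1, locus_2)
```

Pairwise values of this kind, converted to distances as one minus the similarity, could then be passed to a hierarchical clustering routine to group loci into clusters and subclusters.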

Aanensen DM, Mavroidi A, Bentley SD, Reeves PR, Spratt BG. 2007. Predicted functions and linkage specificities of the products of the Streptococcus pneumoniae capsular biosynthetic loci. J Bacteriol, 189 (21), pp. 7856-7876.

The sequences of the capsular biosynthetic (cps) loci of 90 serotypes of Streptococcus pneumoniae have recently been determined. Bioinformatic procedures were used to predict the general functions of 1,973 of the 1,999 gene products and to identify proteins within the same homology group, Pfam family, and CAZy glycosyltransferase family. Correlating cps gene content with the 54 known capsular polysaccharide (CPS) structures provided tentative assignments of the specific functions of the different homology groups of each functional class (regulatory proteins, enzymes for synthesis of CPS constituents, polymerases, flippases, initial sugar transferases, glycosyltransferases [GTs], phosphotransferases, acetyltransferases, and pyruvyltransferases). Assignment of the glycosidic linkages catalyzed by the 342 GTs (92 homology groups) is problematic, but tentative assignments could be made by using this large set of cps loci and CPS structures to correlate the presence of particular GTs with specific glycosidic linkages, by correlating inverting or retaining linkages in CPS repeat units with the inverting or retaining mechanisms of the GTs predicted from their CAZy family membership, and by comparing the CPS structures of serotypes that have very similar cps gene contents. These large-scale comparisons between structure and gene content assigned the linkages catalyzed by 72% of the GTs, and all linkages were assigned in 32 of the serotypes with known repeat unit structures. Clear examples where very similar initial sugar transferases or glycosyltransferases catalyze different linkages in different serotypes were also identified. These assignments should provide a stimulus for biochemical studies to evaluate the reactions that are proposed.

Abbott JC, Aanensen DM, Bentley SD. 2007. WebACT: an online genome comparison suite. Methods Mol Biol, 395 pp. 57-74.

Comparison of related genomes is an enormously powerful technique for explaining phenotypic differences and revealing recent evolutionary events. Genomes evolve through a host of mechanisms including long- and short-range intragenomic rearrangements, insertion of laterally acquired DNA, gene loss, and single-nucleotide polymorphisms. The Artemis Comparison Tool (ACT) was developed to enable the intuitive visualization of the consequences of such events in the context of two or more aligned genomes. WebACT is an online resource designed to allow the alignment of up to five genomic sequences within the ACT environment without the need for local software installation. Comparisons can be carried out between uploaded sequences, or those selected from the EMBL or RefSeq databases, using BLASTZ, MUMmer, or Basic Local Alignment Search Tool (BLAST). Precomputed comparisons can be selected from a database covering all the completed bacterial chromosome and plasmid sequences in the Genome Reviews database (1). This allows the rapid visualization of regions of interest, without the need to handle the full genome sequences. Here, we describe the process of using WebACT to prepare comparisons for visualization, and the selection of precomputed comparisons from the database. The use of ACT to view the selected comparison is then explored using examples from bacterial genomes.

Bentley SD, Aanensen DM, Mavroidi A, Saunders D, Rabbinowitsch E, Collins M, Donohoe K, Harris D, Murphy L, Quail MA et al. 2006. Genetic analysis of the capsular biosynthetic locus from all 90 pneumococcal serotypes. PLoS Genet, 2 (3), pp. e31.

Several major invasive bacterial pathogens are encapsulated. Expression of a polysaccharide capsule is essential for survival in the blood, and thus for virulence, but also is a target for host antibodies and the basis for effective vaccines. Encapsulated species typically exhibit antigenic variation and express one of a number of immunochemically distinct capsular polysaccharides that define serotypes. We provide the sequences of the capsular biosynthetic genes of all 90 serotypes of Streptococcus pneumoniae and relate these to the known polysaccharide structures and patterns of immunological reactivity of typing sera, thereby providing the most complete understanding of the genetics and origins of bacterial polysaccharide diversity, laying the foundations for molecular serotyping. This is the first time, to our knowledge, that a complete repertoire of capsular biosynthetic genes has been available, enabling a holistic analysis of a bacterial polysaccharide biosynthesis system. Remarkably, the total size of alternative coding DNA at this one locus exceeds 1.8 Mbp, almost equivalent to the entire S. pneumoniae chromosomal complement.

Abbott JC, Aanensen DM, Rutherford K, Butcher S, Spratt BG. 2005. WebACT--an online companion for the Artemis Comparison Tool. Bioinformatics, 21 (18), pp. 3665-3666.

WebACT is an online resource which enables the rapid provision of simultaneous BLAST comparisons between up to five genomic sequences in a format amenable for visualization with the well-known Artemis Comparison Tool (ACT). Comparisons can be generated on-the-fly using sequences directly retrieved via EMBL database queries, or by entering or uploading user sequences. Furthermore, pre-computed comparisons are available between all publicly available, completed prokaryotic genomes and plasmids currently contained within the Genome Reviews database (372 sequences, representing 175 different species). The system is designed to minimize the volume of downloaded data and maximize performance. Genome sequences, annotation and pre-computed comparisons are stored in a relational database allowing flexible querying based on user-defined sequence regions, from whole genome to a defined region flanking a specified gene. Comparison and sequence files, whether computed online or retrieved from the database of pre-computed genome comparisons, can be viewed online using ACT and are available for download. AVAILABILITY: Freely accessible at http://www.webact.org. SUPPLEMENTARY INFORMATION: User guide and worked examples are available at http://www.webact.org/WebACT/docs.

Martin IMC, Ison CA, Aanensen DM, Fenton KA, Spratt BG. 2005. Changing epidemiologic profile of quinolone-resistant Neisseria gonorrhoeae in London. J Infect Dis, 192 (7), pp. 1191-1195.

The percentage of quinolone-resistant Neisseria gonorrhoeae isolated in London increased between 2000 and 2003, from 0.9% to 7.9% of total isolates. This increase was investigated by genotyping resistant isolates and comparing demographic and behavioral data. In 2000, resistant isolates predominantly had unique sequence types (STs) that were associated with imported infection, whereas, in 2002 and 2003, large ST clusters of indistinguishable isolates were associated with endemic acquisition. Resistant isolates that belonged to these large clusters were typically from patients who had similar epidemiological characteristics (such as ethnicity and sexual orientation) and behavioral characteristics (such as multiple sex partners and previous gonorrhea). In London, quinolone resistance is no longer associated with importation from areas of high prevalence and is spreading endemically in high-risk groups.

Aanensen DM, Spratt BG. 2005. The multilocus sequence typing network: mlst.net. Nucleic Acids Res, 33 (Web Server issue), pp. W728-W733.

The unambiguous characterization of strains of a pathogen is crucial for addressing questions relating to its epidemiology, population and evolutionary biology. Multilocus sequence typing (MLST), which defines strains from the sequences at seven house-keeping loci, has become the method of choice for molecular typing of many bacterial and fungal pathogens (and non-pathogens), and MLST schemes and strain databases are available for a growing number of prokaryotic and eukaryotic organisms. Sequence data are ideal for strain characterization as they are unambiguous, meaning strains can readily be compared between laboratories via the Internet. Laboratories undertaking MLST can quickly progress from sequencing the seven gene fragments to characterizing their strains and relating them to those submitted by others and to the population as a whole. We provide the gateway to a number of MLST schemes, each of which contains a set of tools for the initial characterization of strains, and methods for relating query strains to other strains of the species, including clustering based on differences in allelic profiles, phylogenetic trees based on concatenated sequences, and a recently developed method (eBURST) for identifying clonal complexes within a species and displaying the overall structure of the population. This network of MLST websites is available at http://www.mlst.net.

Spratt BG, Hanage WP, Li B, Aanensen DM, Feil EJ. 2004. Displaying the relatedness among isolates of bacterial species -- the eBURST approach. FEMS Microbiol Lett, 241 (2), pp. 129-134.

Determining the most appropriate way to represent the relationships between bacterial isolates is complicated by the differing rates of recombination within species. In many cases, a bifurcating tree can be positively misleading. The recently described program eBURST can be used with multilocus data to define groups or clonal complexes of related isolates derived from a common ancestor, the patterns of descent linking them together, and the ancestral genotype. eBURST has recently been extensively updated to include additional tools for exploring the relationships between isolates. We discuss the advantages of this approach and describe its use to explore patterns of descent within clonal complexes identified using multilocus sequence typing.

Mavroidi A, Godoy D, Aanensen DM, Robinson DA, Hollingshead SK, Spratt BG. 2004. Evolutionary genetics of the capsular locus of serogroup 6 pneumococci. J Bacteriol, 186 (24), pp. 8181-8192.

The evolution of the capsular biosynthetic (cps) locus of serogroup 6 Streptococcus pneumoniae was investigated by analyzing sequence variation within three serotype-specific cps genes from 102 serotype 6A and 6B isolates. Sequence variation within these cps genes was related to the genetic relatedness of the isolates, determined by multilocus sequence typing, and to the inferred patterns of recent evolutionary descent, explored using the eBURST algorithm. The serotype-specific cps genes had a low percent G+C, and there was a low level of sequence diversity in this region among serotype 6A and 6B isolates. There was also little sequence divergence between these serotypes, suggesting a single introduction of an ancestral cps sequence, followed by slight divergence to create serotypes 6A and 6B. A minority of serotype 6B isolates had cps sequences (class 2 sequences) that were approximately 5% divergent from those of other serotype 6B isolates (class 1 sequences) and which may have arisen by a second, more recent introduction from a related but distinct source. Expression of a serotype 6A or 6B capsule correlated perfectly with a single nonsynonymous polymorphism within wciP, the rhamnosyl transferase gene. In addition to ample evidence of the horizontal transfer of the serotype 6A and 6B cps locus into unrelated lineages, there was evidence for relatively frequent changes from serotype 6A to 6B, and vice versa, among very closely related isolates and examples of recent recombinational events between class 1 and 2 cps serogroup 6 sequences.

Bougnoux M-E, Aanensen DM, Morand S, Théraud M, Spratt BG, d'Enfert C. 2004. Multilocus sequence typing of Candida albicans: strategies, data exchange and applications. Infect Genet Evol, 4 (3), pp. 243-252.

C. albicans is a commensal of humans and animals but is also the main fungal pathogen of humans, ranking fourth among the microorganisms responsible for hospital-acquired bloodstream infections. Information on the genetic diversity and dynamics of the C. albicans population and on the characteristics of C. albicans strains causing invasive infections in immunocompromised patients is important in order to adapt prevention policies. Important results in this field have been obtained using the Ca3 fingerprinting probe. Recently, multilocus sequence typing (MLST) based on the sequencing of 6-8 selected house-keeping genes and identification of polymorphic nucleotide sites has been introduced for the characterization of C. albicans isolates. Combination of the alleles at the different loci results in unique diploid sequence types (DSTs) that can be used to discriminate strains. MLST has now been successfully applied to study the epidemiology of C. albicans in the hospital as well as the diversity of C. albicans isolates obtained from diverse ecological niches including human and animal hosts. Furthermore, MLST data for C. albicans are available in a public database (http://calbicans.mlst.net) that provides a new resource to evaluate the worldwide diversity of C. albicans and the relationships of isolates identified at various locations.

Martin IMC, Ison CA, Aanensen DM, Fenton KA, Spratt BG. 2004. Rapid sequence-based identification of gonococcal transmission clusters in a large metropolitan area. J Infect Dis, 189 (8), pp. 1497-1505.

In large metropolitan areas, which typically have the highest rates of gonorrhea, the identification of chains of transmission by use of partner notification is problematic, and there is an increasing interest in applying molecular approaches, which would require new discriminatory high-throughput procedures for recognizing clusters of indistinguishable gonococci, procedures that identify local chains of transmission. Sequencing of internal fragments of 2 highly polymorphic loci, from 436 isolates recovered in London during a 3-month period, identified clusters of antibiotic-resistant and antibiotic-susceptible isolates with indistinguishable genotypes, the vast majority of which were also identical or closely related by other methods, and defined groups of individuals who typically had similar demographic characteristics. This discriminatory sequence-based approach produces unambiguous data that easily can be compared via the Internet and appears to be suitable for the identification of linked cases of gonorrhea and the timely identification of transmission of antibiotic-resistant strains, even within large cities.

Feil EJ, Li BC, Aanensen DM, Hanage WP, Spratt BG. 2004. eBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J Bacteriol, 186 (5), pp. 1518-1530.

The introduction of multilocus sequence typing (MLST) for the precise characterization of isolates of bacterial pathogens has had a marked impact on both routine epidemiological surveillance and microbial population biology. In both fields, a key prerequisite for exploiting this resource is the ability to discern the relatedness and patterns of evolutionary descent among isolates with similar genotypes. Traditional clustering techniques, such as dendrograms, provide a very poor representation of recent evolutionary events, as they attempt to reconstruct relationships in the absence of a realistic model of the way in which bacterial clones emerge and diversify to form clonal complexes. An increasingly popular approach, called BURST, has been used as an alternative, but present implementations are unable to cope with very large data sets and offer crude graphical outputs. Here we present a new implementation of this algorithm, eBURST, which divides an MLST data set of any size into groups of related isolates and clonal complexes, predicts the founding (ancestral) genotype of each clonal complex, and computes the bootstrap support for the assignment. The most parsimonious patterns of descent of all isolates in each clonal complex from the predicted founder(s) are then displayed. The advantages of eBURST for exploring patterns of evolutionary descent are demonstrated with a number of examples, including the simple Spain(23F)-1 clonal complex of Streptococcus pneumoniae, "population snapshots" of the entire S. pneumoniae and Staphylococcus aureus MLST databases, and the more complicated clonal complexes observed for Campylobacter jejuni and Neisseria meningitidis.

Godoy D, Randle G, Simpson AJ, Aanensen DM, Pitt TL, Kinoshita R, Spratt BG. 2003. Multilocus sequence typing and evolutionary relationships among the causative agents of melioidosis and glanders, Burkholderia pseudomallei and Burkholderia mallei (vol 41, pg 2068, 2003). J Clin Microbiol, 41 (10), p. 4913.

Godoy D, Randle G, Simpson AJ, Aanensen DM, Pitt TL, Kinoshita R, Spratt BG. 2003. Multilocus sequence typing and evolutionary relationships among the causative agents of melioidosis and glanders, Burkholderia pseudomallei and Burkholderia mallei. J Clin Microbiol, 41 (5), pp. 2068-2079.

A collection of 147 isolates of Burkholderia pseudomallei, B. mallei, and B. thailandensis was characterized by multilocus sequence typing (MLST). The 128 isolates of B. pseudomallei, the causative agent of melioidosis, were obtained from diverse geographic locations, from humans and animals with disease, and from the environment and were resolved into 71 sequence types. The utility of the MLST scheme for epidemiological investigations was established by analyzing isolates from captive marine mammals and birds and from humans in Hong Kong with melioidosis. MLST gave a level of resolution similar to that given by pulsed-field gel electrophoresis and identified the same three clones causing disease in animals, each of which was also associated with disease in humans. The average divergence between the alleles of B. thailandensis and B. pseudomallei was 3.2%, and there was no sharing of alleles between these species. Trees constructed from differences in the allelic profiles of the isolates and from the concatenated sequences of the seven loci showed that the B. pseudomallei isolates formed a cluster of closely related lineages that were fully resolved from the cluster of B. thailandensis isolates, confirming their separate species status. However, isolates of B. mallei, the causative agent of glanders, recovered from three continents over a 30-year period had identical allelic profiles, and the B. mallei isolates clustered within the B. pseudomallei group of isolates. Alleles at six of the seven loci in B. mallei were also present within B. pseudomallei isolates, and B. mallei is a clone of B. pseudomallei that, on population genetics grounds, should not be given separate species status.
