Real-time detection of multidrug resistant tuberculosis and transmission in England

Project Overview

The World Health Organization estimates that two billion people are infected with tuberculosis, which causes 1.5 million deaths per year. Cases of multidrug-resistant tuberculosis are difficult to treat and continue to spread. In Britain, Public Health England and researchers from the University of Oxford are leading international efforts to develop rapid genomic tests that identify multidrug-resistant strains more quickly and detect transmission clusters as new cases emerge.

This year, whole genome sequencing will be rolled out across England and Wales to all new cases, and retroactively to an archive of over 2000 strains. Pilot studies have shown that genomics offers unparalleled resolution for rapidly detecting multidrug resistance and identifying recent transmission, but major challenges remain. Important obstacles to the widespread uptake of genomics are the development of fast computational methods for recognizing closely related cases among databases that will soon harbour hundreds of thousands of strains, the integration of forward-looking genomic approaches with a vast body of expertise based on earlier molecular typing schemes, and the development of a future-proof taxonomy for strain characterization and identification. The aim of this project is to address these challenges with a range of computational approaches drawing on evolution, population genetics, statistics and bioinformatics.

David Wyllie is a clinician, microbiologist, molecular biologist and data scientist based jointly at the University of Oxford Nuffield Department of Medicine and Public Health England. Danny Wilson is a Sir Henry Dale Fellow working on microbial genomics at the University of Oxford Nuffield Department of Medicine. He is an associate member of the Department of Statistics and a principal investigator at the Oxford Martin School Institute of Emerging Infections.

This project is part of the Modernising Medical Microbiology Project, an international consortium led by Professor Derrick Crook, professor of microbiology at the University of Oxford and director of the National Infection Service at Public Health England. The consortium is supported by a multi-million pound funding portfolio, including support from the Wellcome Trust, the Medical Research Council and the Bill and Melinda Gates Foundation. We have a strong record of publishing in the top internationally recognized journals, and our research has influenced the delivery of public health and microbiology in Britain and beyond. For more information visit

Training Opportunities

The Modernising Medical Microbiology consortium provides an excellent research environment in which to develop new skills and train among world-leading scientists in their field. Based at the John Radcliffe Hospital, the University of Oxford team consists of a community of research groups led by Profs Derrick Crook, Tim Peto, Sarah Walker and Drs David Clifton, Kate Dingle, Phil Fowler, Zam Iqbal, Danny Wilson and David Wyllie. We have specialist expertise in microbiology, genomics, statistics, epidemiology and bioinformatics, and our work focuses on understanding the causes of infectious disease in populations. Training is provided by weekly supervisory meetings, weekly Modernising Medical Microbiology work-in-progress meetings, journal clubs, seminar series, and external opportunities including attending national and international conferences. The department and university run training courses, while the Department for Continuing Education and the Language Centre offer further opportunities for personal development to research students at Oxford.


Genetics & Genomics and Immunology & Infectious Disease


Project reference number: 828



Name | Department | Institution | Country | Email
Professor Daniel J Wilson | Experimental Medicine Division | Oxford University, John Radcliffe Hospital | GBR |
Dr David Wyllie | Jenner Institute | Oxford University, Henry Wellcome Building for Molecular Physiology | GBR |

Earle SG, Wu CH, Charlesworth J, Stoesser N, Gordon NC, Walker TM, Spencer CC, Iqbal Z, Clifton DA, Hopkins KL, Woodford N, Smith EG, Ismail N, Llewelyn MJ, Peto TE, Crook DW, McVean G, Walker AS, Wilson DJ. 2016. Identifying lineage effects when controlling for population structure improves power in bacterial association studies. Nat Microbiol, 1 pp. 16041.

Bacteria pose unique challenges for genome-wide association studies because of strong structuring into distinct strains and substantial linkage disequilibrium across the genome(1,2). Although methods developed for human studies can correct for strain structure(3,4), this risks considerable loss-of-power because genetic differences between strains often contribute substantial phenotypic variability(5). Here, we propose a new method that captures lineage-level associations even when locus-specific associations cannot be fine-mapped. We demonstrate its ability to detect genes and genetic variants underlying resistance to 17 antimicrobials in 3,144 isolates from four taxonomically diverse clonal and recombining bacteria: Mycobacterium tuberculosis, Staphylococcus aureus, Escherichia coli and Klebsiella pneumoniae. Strong selection, recombination and penetrance confer high power to recover known antimicrobial resistance mechanisms and reveal a candidate association between the outer membrane porin nmpC and cefazolin resistance in E. coli. Hence, our method pinpoints locus-specific effects where possible and boosts power by detecting lineage-level differences when fine-mapping is intractable.

Das S, Lindemann C, Young BC, Muller J, Österreich B, Ternette N, Winkler AC, Paprotka K, Reinhardt R, Förstner KU, Allen E, Flaxman A, Yamaguchi Y, Rollier CS, van Diemen P, Blättner S, Remmele CW, Selle M, Dittrich M, Müller T, Vogel J, Ohlsen K, Crook DW, Massey R, Wilson DJ, Rudel T, Wyllie DH, Fraunholz MJ. 2016. Natural mutations in a Staphylococcus aureus virulence regulator attenuate cytotoxicity but permit bacteremia and abscess formation. Proc. Natl. Acad. Sci. U.S.A., 113 (22), pp. E3101-10.

Staphylococcus aureus is a major bacterial pathogen, which causes severe blood and tissue infections that frequently emerge by autoinfection with asymptomatically carried nose and skin populations. However, recent studies report that bloodstream isolates differ systematically from those found in the nose and skin, exhibiting reduced toxicity toward leukocytes. In two patients, an attenuated toxicity bloodstream infection evolved from an asymptomatically carried high-toxicity nasal strain by loss-of-function mutations in the gene encoding the transcription factor repressor of surface proteins (rsp). Here, we report that rsp knockout mutants lead to global transcriptional and proteomic reprofiling, and they exhibit the greatest signal in a genome-wide screen for genes influencing S. aureus survival in human cells. This effect is likely to be mediated in part via SSR42, a long-noncoding RNA. We show that rsp controls SSR42 expression, is induced by hydrogen peroxide, and is required for normal cytotoxicity and hemolytic activity. Rsp inactivation in laboratory- and bacteremia-derived mutants attenuates toxin production, but up-regulates other immune subversion proteins and reduces lethality during experimental infection. Crucially, inactivation of rsp preserves bacterial dissemination, because it affects neither formation of deep abscesses in mice nor survival in human blood. Thus, we have identified a spontaneously evolving, attenuated-cytotoxicity, nonhemolytic S. aureus phenotype, controlled by a pleiotropic transcriptional regulator/noncoding RNA virulence regulatory system, capable of causing S. aureus bloodstream infections. Such a phenotype could promote deep infection with limited early clinical manifestations, raising concerns that bacterial evolution within the human body may contribute to severe infection.

De Silva D, Peters J, Cole K, Cole MJ, Cresswell F, Dean G, Dave J, Thomas DR, Foster K, Waldram A, Wilson DJ, Didelot X, Grad YH, Crook DW, Peto TE, Walker AS, Paul J, Eyre DW. 2016. Whole-genome sequencing to determine transmission of Neisseria gonorrhoeae: an observational study. Lancet Infect Dis, 16 (11), pp. 1295-1303.

BACKGROUND: New approaches are urgently required to address increasing rates of gonorrhoea and the emergence and global spread of antibiotic-resistant Neisseria gonorrhoeae. We used whole-genome sequencing to study transmission and track resistance in N gonorrhoeae isolates. METHODS: We did whole-genome sequencing of isolates obtained from samples collected from patients attending sexual health services in Brighton, UK, between Jan 1, 2011, and March 9, 2015. We also included isolates from other UK locations, historical isolates from Brighton, and previous data from a US study. Samples from symptomatic patients and asymptomatic sexual health screening underwent nucleic acid amplification testing; positive samples and all samples from symptomatic patients were cultured for N gonorrhoeae, and resulting isolates were whole-genome sequenced. Cefixime susceptibility testing was done in selected isolates by agar incorporation, and we used sequence data to determine multi-antigen sequence types and penA genotypes. We derived a transmission nomogram to determine the plausibility of direct or indirect transmission between any two cases depending on the time between samples: estimated mutation rates, plus diversity noted within patients across anatomical sites and probable transmission pairs, were used to fit a coalescent model to determine the number of single nucleotide polymorphisms expected. FINDINGS: 1407 (98%) of 1437 Brighton isolates between Jan 1, 2011, and March 9, 2015 were successfully sequenced. We identified 1061 infections from 907 patients. 281 (26%) of these infections were indistinguishable (ie, differed by zero single nucleotide polymorphisms) from one or more previous cases, and 786 (74%) had evidence of a sampled direct or indirect Brighton source. We observed multiple related samples across geographical locations. Of 1273 infections in Brighton (including historical data), 225 (18%) were linked to another case elsewhere in the UK, and 115 (9%) to a case in the USA. Four lineages initially identified in Brighton could be linked to 70 USA sequences, including 61 from a lineage carrying the mosaic penA XXXIV allele, which is associated with reduced cefixime susceptibility. INTERPRETATION: We present a whole-genome-sequencing-based tool for genomic contact tracing of N gonorrhoeae and demonstrate local, national, and international transmission. Whole-genome sequencing can be applied across geographical boundaries to investigate gonorrhoea transmission and to track antimicrobial resistance. FUNDING: Oxford National Institute for Health Research Health Protection Research Unit and Biomedical Research Centre.

Laabei M, Uhlemann AC, Lowy FD, Austin ED, Yokoyama M, Ouadi K, Feil E, Thorpe HA, Williams B, Perkins M, Peacock SJ, Clarke SR, Dordel J, Holden M, Votintseva AA, Bowden R, Crook DW, Young BC, Wilson DJ, Recker M, Massey RC. 2015. Evolutionary Trade-Offs Underlie the Multi-faceted Virulence of Staphylococcus aureus. PLoS Biol., 13 (9), pp. e1002229.

Bacterial virulence is a multifaceted trait where the interactions between pathogen and host factors affect the severity and outcome of the infection. Toxin secretion is central to the biology of many bacterial pathogens and is widely accepted as playing a crucial role in disease pathology. To understand the relationship between toxicity and bacterial virulence in greater depth, we studied two sequenced collections of the major human pathogen Staphylococcus aureus and found an unexpected inverse correlation between bacterial toxicity and disease severity. By applying a functional genomics approach, we identified several novel toxicity-affecting loci responsible for the wide range in toxic phenotypes observed within these collections. To understand the apparent higher propensity of low toxicity isolates to cause bacteraemia, we performed several functional assays, and our findings suggest that within-host fitness differences between high- and low-toxicity isolates in human serum is a contributing factor. As invasive infections, such as bacteraemia, limit the opportunities for onward transmission, highly toxic strains could gain an additional between-host fitness advantage, potentially contributing to the maintenance of toxicity at the population level. Our results clearly demonstrate how evolutionary trade-offs between toxicity, relative fitness, and transmissibility are critical for understanding the multifaceted nature of bacterial virulence.

De Maio N, Wu CH, O'Reilly KM, Wilson D. 2015. New Routes to Phylogeography: A Bayesian Structured Coalescent Approximation. PLoS Genet., 11 (8), pp. e1005421.

Phylogeographic methods aim to infer migration trends and the history of sampled lineages from genetic data. Applications of phylogeography are broad, and in the context of pathogens include the reconstruction of transmission histories and the origin and emergence of outbreaks. Phylogeographic inference based on bottom-up population genetics models is computationally expensive, and as a result faster alternatives based on the evolution of discrete traits have become popular. In this paper, we show that inference of migration rates and root locations based on discrete trait models is extremely unreliable and sensitive to biased sampling. To address this problem, we introduce BASTA (BAyesian STructured coalescent Approximation), a new approach implemented in BEAST2 that combines the accuracy of methods based on the structured coalescent with the computational efficiency required to handle more than just few populations. We illustrate the potentially severe implications of poor model choice for phylogeographic analyses by investigating the zoonotic transmission of Ebola virus. Whereas the structured coalescent analysis correctly infers that successive human Ebola outbreaks have been seeded by a large unsampled non-human reservoir population, the discrete trait analysis implausibly concludes that undetected human-to-human transmission has allowed the virus to persist over the past four decades. As genomics takes on an increasingly prominent role informing the control and prevention of infectious diseases, it will be vital that phylogeographic inference provides robust insights into transmission history.

Everitt RG, Didelot X, Batty EM, Miller RR, Knox K, Young BC, Bowden R, Auton A, Votintseva A, Larner-Svensson H, Charlesworth J, Golubchik T, Ip CL, Godwin H, Fung R, Peto TE, Walker AS, Crook DW, Wilson DJ. 2014. Mobile elements drive recombination hotspots in the core genome of Staphylococcus aureus. Nat Commun, 5 pp. 3956.

Horizontal gene transfer is an important driver of bacterial evolution, but genetic exchange in the core genome of clonal species, including the major pathogen Staphylococcus aureus, is incompletely understood. Here we reveal widespread homologous recombination in S. aureus at the species level, in contrast to its near-complete absence between closely related strains. We discover a patchwork of hotspots and coldspots at fine scales falling against a backdrop of broad-scale trends in rate variation. Over megabases, homoplasy rates fluctuate 1.9-fold, peaking towards the origin-of-replication. Over kilobases, we find core recombination hotspots of up to 2.5-fold enrichment situated near fault lines in the genome associated with mobile elements. The strongest hotspots include regions flanking conjugative transposon ICE6013, the staphylococcal cassette chromosome (SCC) and genomic island νSaα. Mobile element-driven core genome transfer represents an opportunity for adaptation and challenges our understanding of the recombination landscape in predominantly clonal pathogens, with important implications for genotype-phenotype mapping.

Eyre DW, Cule ML, Wilson DJ, Griffiths D, Vaughan A, O'Connor L, Ip CL, Golubchik T, Batty EM, Finney JM, Wyllie DH, Didelot X, Piazza P, Bowden R, Dingle KE, Harding RM, Crook DW, Wilcox MH, Peto TE, Walker AS. 2013. Diverse sources of C. difficile infection identified on whole-genome sequencing. N. Engl. J. Med., 369 (13), pp. 1195-205.

BACKGROUND: It has been thought that Clostridium difficile infection is transmitted predominantly within health care settings. However, endemic spread has hampered identification of precise sources of infection and the assessment of the efficacy of interventions. METHODS: From September 2007 through March 2011, we performed whole-genome sequencing on isolates obtained from all symptomatic patients with C. difficile infection identified in health care settings or in the community in Oxfordshire, United Kingdom. We compared single-nucleotide variants (SNVs) between the isolates, using C. difficile evolution rates estimated on the basis of the first and last samples obtained from each of 145 patients, with 0 to 2 SNVs expected between transmitted isolates obtained less than 124 days apart, on the basis of a 95% prediction interval. We then identified plausible epidemiologic links among genetically related cases from data on hospital admissions and community location. RESULTS: Of 1250 C. difficile cases that were evaluated, 1223 (98%) were successfully sequenced. In a comparison of 957 samples obtained from April 2008 through March 2011 with those obtained from September 2007 onward, a total of 333 isolates (35%) had no more than 2 SNVs from at least 1 earlier case, and 428 isolates (45%) had more than 10 SNVs from all previous cases. Reductions in incidence over time were similar in the two groups, a finding that suggests an effect of interventions targeting the transition from exposure to disease. Of the 333 patients with no more than 2 SNVs (consistent with transmission), 126 patients (38%) had close hospital contact with another patient, and 120 patients (36%) had no hospital or community contact with another patient. Distinct subtypes of infection continued to be identified throughout the study, which suggests a considerable reservoir of C. difficile. CONCLUSIONS: Over a 3-year period, 45% of C. difficile cases in Oxfordshire were genetically distinct from all previous cases. Genetically diverse sources, in addition to symptomatic patients, play a major part in C. difficile transmission. (Funded by the U.K. Clinical Research Collaboration Translational Infection Research Initiative and others.).
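
The SNV thresholds quoted in this abstract (no more than 2 SNVs from at least one earlier case, versus more than 10 SNVs from all previous cases) can be illustrated with a minimal Python sketch. This is not the study's pipeline: the function names, the labels, and the input format (equal-length aligned sequences listed in sampling order) are assumptions made for illustration, and the sketch ignores the time-between-samples condition that the study also applied.

```python
from typing import List


def snv_distance(a: str, b: str) -> int:
    """Count positions where two aligned sequences differ, skipping 'N' (no call)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y and x != "N" and y != "N")


def classify_against_earlier(
    seqs: List[str], related_max: int = 2, distinct_min: int = 10
) -> List[str]:
    """Label each case by its smallest SNV distance to any earlier case.

    seqs is assumed to be ordered by sampling date (hypothetical input format).
    The default thresholds are the ones quoted in the abstract.
    """
    labels = []
    for i, seq in enumerate(seqs):
        if i == 0:
            # The first case has nothing earlier to compare against.
            labels.append("no_earlier_case")
            continue
        nearest = min(snv_distance(seq, earlier) for earlier in seqs[:i])
        if nearest <= related_max:
            labels.append("related")       # within 2 SNVs of an earlier case
        elif nearest > distinct_min:
            labels.append("distinct")      # more than 10 SNVs from all earlier cases
        else:
            labels.append("intermediate")  # 3 to 10 SNVs from the nearest earlier case
    return labels


if __name__ == "__main__":
    toy = ["A" * 20, "A" * 19 + "C", "C" * 12 + "A" * 8, "C" * 5 + "A" * 15]
    print(classify_against_earlier(toy))
```

Each new case here is compared with every earlier one, so the cost grows quadratically with the number of cases; that is acceptable for a toy example but not for databases of the size the project overview anticipates.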

Young BC, Golubchik T, Batty EM, Fung R, Larner-Svensson H, Votintseva AA, Miller RR, Godwin H, Knox K, Everitt RG, Iqbal Z, Rimmer AJ, Cule M, Ip CL, Didelot X, Harding RM, Donnelly P, Peto TE, Crook DW, Bowden R, Wilson DJ. 2012. Evolutionary dynamics of Staphylococcus aureus during progression from carriage to disease. Proc. Natl. Acad. Sci. U.S.A., 109 (12), pp. 4550-5.

Whole-genome sequencing offers new insights into the evolution of bacterial pathogens and the etiology of bacterial disease. Staphylococcus aureus is a major cause of bacteria-associated mortality and invasive disease and is carried asymptomatically by 27% of adults. Eighty percent of bacteremias match the carried strain. However, the role of evolutionary change in the pathogen during the progression from carriage to disease is incompletely understood. Here we use high-throughput genome sequencing to discover the genetic changes that accompany the transition from nasal carriage to fatal bloodstream infection in an individual colonized with methicillin-sensitive S. aureus. We found a single, cohesive population exhibiting a repertoire of 30 single-nucleotide polymorphisms and four insertion/deletion variants. Mutations accumulated at a steady rate over a 13-mo period, except for a cluster of mutations preceding the transition to disease. Although bloodstream bacteria differed by just eight mutations from the original nasally carried bacteria, half of those mutations caused truncation of proteins, including a premature stop codon in an AraC-family transcriptional regulator that has been implicated in pathogenicity. Comparison with evolution in two asymptomatic carriers supported the conclusion that clusters of protein-truncating mutations are highly unusual. Our results demonstrate that bacterial diversity in vivo is limited but nonetheless detectable by whole-genome sequencing, enabling the study of evolutionary dynamics within the host. Regulatory or structural changes that occur during carriage may be functionally important for pathogenesis; therefore identifying those changes is a crucial step in understanding the biological causes of invasive bacterial disease.

Laabei M, Recker M, Rudkin JK, Aldeljawi M, Gulay Z, Sloan TJ, Williams P, Endres JL, Bayles KW, Fey PD, Yajjala VK, Widhelm T, Hawkins E, Lewis K, Parfett S, Scowen L, Peacock SJ, Holden M, Wilson D, Read TD, van den Elsen J, Priest NK, Feil EJ, Hurst LD, Josefsson E, Massey RC. 2014. Predicting the virulence of MRSA from its genome sequence. Genome Res., 24 (5), pp. 839-49.

Microbial virulence is a complex and often multifactorial phenotype, intricately linked to a pathogen's evolutionary trajectory. Toxicity, the ability to destroy host cell membranes, and adhesion, the ability to adhere to human tissues, are the major virulence factors of many bacterial pathogens, including Staphylococcus aureus. Here, we assayed the toxicity and adhesiveness of 90 MRSA (methicillin resistant S. aureus) isolates and found that while there was remarkably little variation in adhesion, toxicity varied by over an order of magnitude between isolates, suggesting different evolutionary selection pressures acting on these two traits. We performed a genome-wide association study (GWAS) and identified a large number of loci, as well as a putative network of epistatically interacting loci, that significantly associated with toxicity. Despite this apparent complexity in toxicity regulation, a predictive model based on a set of significant single nucleotide polymorphisms (SNPs) and insertion and deletions events (indels) showed a high degree of accuracy in predicting an isolate's toxicity solely from the genetic signature at these sites. Our results thus highlight the potential of using sequence data to determine clinically relevant parameters and have further implications for understanding the microbial virulence of this opportunistic pathogen.