Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

The field of genomic epidemiology is rapidly growing as many jurisdictions begin to deploy whole-genome sequencing (WGS) in their national or regional pathogen surveillance programmes. WGS data offer a rich view of the shared ancestry of a set of taxa, typically visualized with phylogenetic trees illustrating the clusters or subtypes present in a group of taxa, their relatedness and the extent of diversification within and between them. When methicillin-resistant Staphylococcus aureus (MRSA) arose and disseminated widely, phylogenetic trees of MRSA-containing types of S. aureus had a distinctive 'comet' shape, with a 'comet head' of recently adapted drug-resistant isolates in the context of a 'comet tail' that was predominantly drug-sensitive. Placing an S. aureus isolate in the context of such a 'comet' helped public health laboratories interpret local data within the broader setting of S. aureus evolution. In this work, we ask what other tree shapes, analogous to the MRSA comet, are present in bacterial WGS datasets. We extract trees from large bacterial genomic datasets, visualize them as images and cluster the images. We find nine major groups of tree images, including the 'comets', star-like phylogenies, 'barbell' phylogenies and other shapes, and comment on the evolutionary and epidemiological stories these shapes might illustrate. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.

Original publication




Journal article


Philos Trans R Soc Lond B Biol Sci

Publication Date





clustering, deep learning, image processing, phylogenetics, Anti-Bacterial Agents, Cluster Analysis, Genome, Bacterial, Humans, Methicillin-Resistant Staphylococcus aureus, Phylogeny, Staphylococcal Infections, Staphylococcus aureus