Highly Commended in Overall Graduate Prize
I completed my BSc in physics at University College Cork, during which I developed a keen interest in analytical methods and research software development. During my undergraduate degree I worked on projects involving the analysis of complex experimental datasets relating to high-temperature superconductivity and quantised conductance. Following this, I had the opportunity to complete an MPhil in Computational Biology at Fitzwilliam College, Cambridge, where I worked on machine learning models for predicting the quality of genetic variant calls from DNA sequencing data. This work was eventually implemented in Illumina’s variant calling pipeline.
From there, I was awarded a DPhil studentship from the Wellcome Trust to take part in the Genomic Medicine and Statistics programme, where I joined Zamin Iqbal’s group. I began work on developing graph-based analysis of clinical microbial whole-genome sequencing data. In particular, my goal was to develop tools to aid the detection, analysis, and surveillance of antimicrobial resistance, a persistent and growing threat to global health.
My first project was to develop software to rapidly predict antimicrobial resistance and identify species from sequencing data: Mykrobe predictor. Mykrobe predictor was published in Nature Communications in December 2015 and has since been incorporated into Public Health England’s sequencing-based pipeline for processing M. tuberculosis. As a result, Mykrobe predictor now helps to analyse every M. tuberculosis case in the United Kingdom.
In close collaboration with clinicians and epidemiologists at the John Radcliffe Hospital, Oxford, we extended Mykrobe predictor to analyse sequencing data from Oxford Nanopore’s MinION. This was one of the first demonstrations that this technology could be used to rapidly detect clinically relevant bacterial variants. It could reduce the time taken to test an M. tuberculosis isolate from 2 weeks to 12 hours, improving the speed at which doctors receive vital information on their patients’ infections.
In parallel with the above work, my DPhil research has also addressed a pressing problem: most microbial genome sequence data is inaccessible to search, despite being archived centrally and publicly available. We developed a novel method that can index and search collections of millions of genomic datasets: Coloured Bloom Graphs. Using this software, we have indexed a snapshot of all bacterial and viral whole-genome sequence data ever archived, which can now be searched in milliseconds. We hope that opening up these datasets to rapid search will provide a resource for science and real-time public health epidemiology.
Working with the Iqbal group in the Nuffield Department has been a fantastic experience, and I’m extremely grateful for the training and support I received as well as for the opportunity to collaborate with exceptionally talented people. My hope is that some of the research I have conducted here will enable us to better understand and tackle the growing problem of antimicrobial resistance. To that end, I’m hoping to work with the EBI in Cambridge, UK to further develop this software into web services that researchers around the world can use.
Votintseva AA*, Bradley P*, Pankhurst L*, Del Ojo Elias C, Loose M, Nilgiriwala K, Chatterjee A, Smith EG, Sanderson N, Walker TM, Morgan MR, Wyllie DH, Walker AS, Peto TEA, Crook DW, Iqbal Z. Same-Day Diagnostic and Surveillance Data for Tuberculosis via Whole-Genome Sequencing of Direct Respiratory Samples. Journal of Clinical Microbiology Volume 55 (2017) p.1285-1298 *Co-first authors
Bradley P, Gordon NC, Walker TM, Dunn L, Heys S, Huang B, Earle S, Pankhurst LJ, Anson L, De Cesare M, Piazza P. Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis. Nature Communications Volume 6 (2015) p.10063