Genome-wide identification of lineage and locus specific variation associated with pneumococcal carriage duration
Lees JA., Croucher NJ., David G., Francois N., Julian P., Claudia T., Paul T., Stephen DB.
Abstract

Streptococcus pneumoniae is a leading cause of invasive disease in infants, especially in low-income settings. Asymptomatic carriage in the nasopharynx is a prerequisite for disease, and the duration of carriage is an important consideration in modelling transmission dynamics and vaccine response. Existing studies of carriage duration variability are based at the serotype level only, and do not probe variation within lineages or fully quantify interactions with other environmental factors.

Here we developed a model to calculate the duration of carriage episodes from longitudinal swab data. By combining these results with whole genome sequence data we estimate that pneumococcal genomic variation accounted for 63% of the phenotype variation, whereas host traits accounted for less than 5%. We further partitioned this heritability into both lineage and locus effects, and quantified the amount attributable to the largest sources of variation in carriage duration: serotype (17%), drug-resistance (9%) and other significant locus effects (7%). For the locus effects, a genome-wide association study identified 16 loci which may have an effect on carriage duration independent of serotype. Hits at a genome-wide level of significance were to prophage sequences, suggesting infection by such viruses substantially affects carriage duration.

These results show that both serotype and non-serotype specific effects alter carriage duration in infants and young children and are more important than other environmental factors such as host genetics. This has implications for models of pneumococcal competition and antibiotic resistance, and leads the way for the analysis of heritability of complex bacterial traits.

Significance statement

Other than serotype, the genetic determinants of pneumococcal carriage duration are unknown. In this study we used longitudinal sampling to measure the duration of carriage in infants, and searched for any associated variation in the pan-genome. While we found that the pathogen genome explains most of the variability in duration, serotype did not fully account for this. Recent theoretical work has proposed the existence of alleles which alter carriage duration to explain the puzzle of continued coexistence of antibiotic-resistant and sensitive strains. Here we have shown that these alleles do exist in a natural population, and also identified candidates for the loci which fulfil this role. Together these findings have implications for future modelling of pneumococcal epidemiology and resistance.