Identifying recent adaptations in large-scale genomic data.
Grossman SR., Andersen KG., Shlyakhter I., Tabrizi S., Winnicki S., Yen A., Park DJ., Griesemer D., Karlsson EK., Wong SH., Cabili M., Adegbola RA., Bamezai RNK., Hill AVS., Vannberg FO., Rinn JL., 1000 Genomes Project., Lander ES., Schaffner SF., Sabeti PC.
Although several hundred regions of the human genome harbor signals of positive natural selection, few of the relevant adaptive traits and variants have been elucidated. Using full-genome sequence variation from the 1000 Genomes (1000G) Project and the composite of multiple signals (CMS) test, we investigated 412 candidate signals and leveraged functional annotation, protein structure modeling, epigenetics, and association studies to identify and extensively annotate candidate causal variants. The resulting catalog provides a tractable list for experimental follow-up; it includes 35 high-scoring nonsynonymous variants, 59 variants associated with expression levels of a nearby coding gene or lincRNA, and numerous variants associated with susceptibility to infectious disease and other phenotypes. We experimentally characterized one candidate nonsynonymous variant in Toll-like receptor 5 (TLR5) and show that it leads to altered NF-κB signaling in response to bacterial flagellin.