David Wedge

Research Area: Bioinformatics & Stats (inc. Modelling and Computational Biology)
Technology Exchange: Bioinformatics, Computational biology and Statistical genetics
Scientific Themes: Cancer Biology and Genetics & Genomics

The focus of my research is cancer evolution and heterogeneity. Cancers are made up of a heterogeneous mix of cells, each bearing a different set of mutations in its DNA. We aim to characterise groups of cells, or ‘subclones’, according to their mutational profiles and to study the interaction between subclones.

Tumours are difficult to treat because they change over time, gaining mutations that enable them to metastasise to distant organs or that result in resistance to treatment. By comparing multiple samples, we can identify those mutations that cause relapse and progression. Using genetic markers, we can also track the spread of disease, giving us insights into the mechanisms and processes involved in cancer growth and metastasis.

Cancer is a complex disease and the analysis of large numbers of tumours is key to understanding the factors that determine their virulence. The International Cancer Genome Consortium (ICGC) has collected and whole-genome sequenced several thousand cancer samples. I co-lead the Pan-Cancer Working Group on Evolution and Heterogeneity, an international collaboration that is using the DNA sequences of 2700 of these cancer samples to study evolution and heterogeneity across more than 30 different cancer types, including prostate, breast, lung, oesophageal and ovarian cancers.

The 100,000 Genomes Project has whole-genome sequenced 25,000 cancers through Genomics England. Through the NIHR Oxford Biomedical Research Centre (BRC) Molecular Diagnostics theme, members of my research group are analysing the DNA sequences of colorectal, endometrial and testicular cancers from the 100,000 Genomes Project, with the aim of extending the scope of precision medicine. Through the BRC, we are also using novel sequencing technologies to gain further insights into tumour heterogeneity.

Through the Pan Prostate Cancer Group (PPCG), we are analysing the whole-genome sequences of over 1200 prostate cancers. The main aims of this project are to: identify multimodal biomarkers of aggressive disease; provide new insights into prostate cancer diagnosis and mechanisms of development; gain new insights into ethnic differences in the propensity to develop aggressive prostate cancer; and identify new markers of genetic predisposition to prostate cancer.

Name | Department | Institution | Country
Ian Tomlinson | Wellcome Trust Centre for Human Genetics | Oxford University, Henry Wellcome Building of Genomic Medicine | United Kingdom
Professor Tim Maughan | Gray Institute for Radiation Oncology and Biology | Oxford University | United Kingdom
Jenny Taylor | Wellcome Trust Centre for Human Genetics | Oxford University, Henry Wellcome Building of Genomic Medicine | United Kingdom
David Church | Wellcome Trust Centre for Human Genetics | Oxford University, Henry Wellcome Building of Genomic Medicine | United Kingdom
Simon Leedham | Wellcome Trust Centre for Human Genetics | Oxford University, Henry Wellcome Building of Genomic Medicine | United Kingdom
Shazia Irshad | Wellcome Trust Centre for Human Genetics | Oxford University, Henry Wellcome Building of Genomic Medicine | United Kingdom

Cheng J, Demeulemeester J, Wedge DC, Vollan HKM, Pitt JJ, Russnes HG, Pandey BP, Nilsen G, Nord S, Bignell GR et al. 2019. Author Correction: Pan-cancer analysis of homozygous deletions in primary tumours uncovers rare tumour suppressors. Nat Commun, 10 (1), pp. 525.

The original version of this Article omitted a declaration from the competing interests statement, which should have included the following: 'K.P.W. is President of Tempus Lab, Inc., Chicago, IL, USA'. This has now been corrected in both the PDF and HTML versions of the Article.

Hancock BA, Chen Y-H, Solzak JP, Ahmad MN, Wedge DC, Brinza D, Scafe C, Veitch J, Gottimukkala R, Short W et al. 2019. Profiling molecular regulators of recurrence in chemorefractory triple-negative breast cancers. Breast Cancer Res, 21 (1), pp. 87.

BACKGROUND: Approximately two thirds of patients with localized triple-negative breast cancer (TNBC) harbor residual disease (RD) after neoadjuvant chemotherapy (NAC) and have a high risk-of-recurrence. Targeted therapeutic development for TNBC is of primary significance as no targeted therapies are clinically indicated for this aggressive subset. In view of this, we conducted a comprehensive molecular analysis and correlated molecular features of chemorefractory RD tumors with recurrence for the purpose of guiding downstream therapeutic development. METHODS: We assembled DNA and RNA sequencing data from RD tumors as well as pre-operative biopsies, lymphocytic infiltrate, and survival data as part of a molecular correlative to a phase II post-neoadjuvant clinical trial. Matched somatic mutation, gene expression, and lymphocytic infiltrate were assessed before and after chemotherapy to understand how tumors evolve during chemotherapy. Kaplan-Meier survival analyses were conducted categorizing cancers with TP53 mutations by the degree of loss as well as by the copy number of a locus of 18q corresponding to the SMAD2, SMAD4, and SMAD7 genes. RESULTS: Analysis of matched somatic genomes pre-/post-NAC revealed chaotic acquisition of copy gains and losses including amplification of prominent oncogenes. In contrast, significant gains in deleterious point mutations and insertion/deletions were not observed. No trends between clonal evolution and recurrence were identified. Gene expression data from paired biopsies revealed enrichment of actionable regulators of stem cell-like behavior and depletion of immune signaling, which was corroborated by total lymphocytic infiltrate, but was not associated with recurrence. Novel characterization of TP53 mutation revealed prognostically relevant subgroups, which were linked to MYC-driven transcriptional amplification. 
Finally, somatic gains in 18q were associated with poor prognosis, likely driven by putative upregulation of TGFβ signaling through the signal transducer SMAD2. CONCLUSIONS: We conclude TNBCs are dynamic during chemotherapy, demonstrating complex plasticity in subclonal diversity, stem-like qualities, and immune depletion, but somatic alterations of TP53/MYC and TGFβ signaling in RD samples are prominent drivers of recurrence, representing high-yield targets for additional interrogation.

Petljak M, Alexandrov LB, Brammeld JS, Price S, Wedge DC, Grossmann S, Dawson KJ, Ju YS, Iorio F, Tubio JMC et al. 2019. Characterizing Mutational Signatures in Human Cancer Cell Lines Reveals Episodic APOBEC Mutagenesis. Cell, 176 (6), pp. 1282-1294.e20.

Multiple signatures of somatic mutations have been identified in cancer genomes. Exome sequences of 1,001 human cancer cell lines and 577 xenografts revealed most common mutational signatures, indicating past activity of the underlying processes, usually in appropriate cancer types. To investigate ongoing patterns of mutational-signature generation, cell lines were cultured for extended periods and subsequently DNA sequenced. Signatures of discontinued exposures, including tobacco smoke and ultraviolet light, were not generated in vitro. Signatures of normal and defective DNA repair and replication continued to be generated at roughly stable mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing exhibited substantial fluctuations in mutation rate over time with episodic bursts of mutations. The initiating factors for the bursts are unclear, although retrotransposon mobilization may contribute. The examined cell lines constitute a resource of live experimental models of mutational processes, which potentially retain patterns of activity and regulation operative in primary human cancers.

Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, Martincorena I, Alexandrov LB, Martin S, Wedge DC et al. 2019. Author Correction: Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature, 566 (7742), pp. E1.

In the Methods section of this Article, 'greater than' should have been 'less than' in the sentence 'Putative regions of clustered rearrangements were identified as having an average inter-rearrangement distance that was at least 10 times greater than the whole-genome average for the individual sample.' The Article has not been corrected.

Tarabichi M, Martincorena I, Gerstung M, Leroi AM, Markowetz F, PCAWG Evolution and Heterogeneity Working Group, Spellman PT, Morris QD, Lingjærde OC, Wedge DC, Van Loo P. 2018. Neutral tumor evolution? Nat Genet, 50 (12), pp. 1630-1633.

Cheng J, Demeulemeester J, Wedge DC, Vollan HKM, Pitt JJ, Russnes HG, Pandey BP, Nilsen G, Nord S, Bignell GR et al. 2018. Author Correction: Pan-cancer analysis of homozygous deletions in primary tumours uncovers rare tumour suppressors. Nat Commun, 9 (1), pp. 5397.

The original version of this Article contained an error in the author affiliations. The affiliation of Kevin P. White with Tempus Labs, Inc., Chicago, IL, USA was inadvertently omitted. This has now been corrected in both the PDF and HTML versions of the Article.

Höglander EK, Nord S, Wedge DC, Lingjærde OC, Silwal-Pandit L, Gythfeldt HV, Vollan HKM, Fleischer T, Krohn M, Schlitchting E et al. 2018. Time series analysis of neoadjuvant chemotherapy and bevacizumab-treated breast carcinomas reveals a systemic shift in genomic aberrations. Genome Med, 10 (1), pp. 92.

BACKGROUND: Chemotherapeutic agents such as anthracyclines and taxanes are commonly used in the neoadjuvant setting. Bevacizumab is an antibody which binds to vascular endothelial growth factor A (VEGFA) and inhibits its receptor interaction, thus obstructing the formation of new blood vessels. METHODS: A phase II randomized clinical trial of 123 patients with Her2-negative breast cancer was conducted, with patients treated with neoadjuvant chemotherapy (fluorouracil (5FU)/epirubicin/cyclophosphamide (FEC) and taxane), with or without bevacizumab. Serial biopsies were obtained at time of diagnosis, after 12 weeks of treatment with FEC ± bevacizumab, and after 25 weeks of treatment with taxane ± bevacizumab. A time course study was designed to investigate the genomic landscape at the three time points when tumor DNA alterations, tumor percentage, genomic instability, and tumor clonality were assessed. Substantial differences were observed with some tumors changing mainly between diagnosis and at 12 weeks, others between 12 and 25 weeks, and still others changing in both time periods. RESULTS: In both treatment arms, good responders (GR) and non-responders (NR) displayed significant difference in genomic instability index (GII) at time of diagnosis. In the combination arm, copy number alterations at 25 loci at the time of diagnosis were significantly different between the GR and NR. An inverse aberration pattern was also observed between the two extreme response groups at 6p22-p12 for patients in the combination arm. Signs of subclonal reduction were observed, with some aberrations disappearing and others being retained during treatment. Increase in subclonal amplification was observed at 6p21.1, a locus which contains the VEGFA gene for the protein which is targeted by the study drug bevacizumab. Of the 13 pre-treatment samples that had a gain at VEGFA, 12 were responders. 
Significant decrease of frequency of subclones carrying gains at 17q21.32-q22 was observed at 12 weeks, with the peak occurring at TMEM100, an ALK1 receptor signaling-dependent gene essential for vasculogenesis. This implies that cells bearing amplifications of VEGFA and TMEM100 are particularly sensitive to this treatment regime. CONCLUSIONS: Taken together, these results suggest that heterogeneity and subclonal architecture influence the response to targeted treatment in combination with chemotherapy, with possible implications for clinical decision-making and monitoring of treatment efficacy. TRIAL REGISTRATION: NCT00773695. Registered 15 October 2008.

Cooke D, Wedge D, Lunter G. 2018. A unified haplotype-based method for accurate and comprehensive variant calling.

Haplotype-based variant callers, which consider physical linkage between variant sites, are currently among the best tools for germline variation discovery and genotyping from short-read sequencing data. However, almost all such tools were designed specifically for detecting common germline variation in diploid populations, and give sub-optimal results in other scenarios. Here we present Octopus, a versatile haplotype-based variant caller that uses a polymorphic Bayesian genotyping model capable of modeling sequencing data from a range of experimental designs within a unified haplotype-aware framework. We show that Octopus accurately calls de novo mutations in parent-offspring trios and germline variants in individuals, including SNVs, indels, and small complex replacements such as microinversions. In addition, using a carefully designed synthetic-tumour data set derived from clean sequencing data from a sample with known germline haplotypes, and observed mutations in a large cohort of tumour samples, we show that Octopus accurately characterizes germline and somatic variation in tumours, both with and without a paired normal sample. Sequencing reads and prior information are combined to phase called genotypes of arbitrary ploidy, including those with somatic mutations. Octopus also outputs realigned evidence BAMs to aid validation and interpretation.

Grinfeld J, Nangalia J, Baxter EJ, Wedge DC, Angelopoulos N, Cantrill R, Godfrey AL, Papaemmanuil E, Gundem G, MacLean C et al. 2018. Classification and Personalized Prognosis in Myeloproliferative Neoplasms. N Engl J Med, 379 (15), pp. 1416-1430.

BACKGROUND: Myeloproliferative neoplasms, such as polycythemia vera, essential thrombocythemia, and myelofibrosis, are chronic hematologic cancers with varied progression rates. The genomic characterization of patients with myeloproliferative neoplasms offers the potential for personalized diagnosis, risk stratification, and treatment. METHODS: We sequenced coding exons from 69 myeloid cancer genes in patients with myeloproliferative neoplasms, comprehensively annotating driver mutations and copy-number changes. We developed a genomic classification for myeloproliferative neoplasms and multistage prognostic models for predicting outcomes in individual patients. Classification and prognostic models were validated in an external cohort. RESULTS: A total of 2035 patients were included in the analysis. A total of 33 genes had driver mutations in at least 5 patients, with mutations in JAK2, CALR, or MPL being the sole abnormality in 45% of the patients. The numbers of driver mutations increased with age and advanced disease. Driver mutations, germline polymorphisms, and demographic variables independently predicted whether patients received a diagnosis of essential thrombocythemia as compared with polycythemia vera or a diagnosis of chronic-phase disease as compared with myelofibrosis. We defined eight genomic subgroups that showed distinct clinical phenotypes, including blood counts, risk of leukemic transformation, and event-free survival. Integrating 63 clinical and genomic variables, we created prognostic models capable of generating personally tailored predictions of clinical outcomes in patients with chronic-phase myeloproliferative neoplasms and myelofibrosis. The predicted and observed outcomes correlated well in internal cross-validation of a training cohort and in an independent external cohort. Even within individual categories of existing prognostic schemas, our models substantially improved predictive accuracy. 
CONCLUSIONS: Comprehensive genomic characterization identified distinct genetic subgroups and provided a classification of myeloproliferative neoplasms on the basis of causal biologic mechanisms. Integration of genomic data with clinical variables enabled the personalized predictions of patients' outcomes and may support the treatment of patients with myeloproliferative neoplasms. (Funded by the Wellcome Trust and others.)

Cross W, Kovac M, Mustonen V, Temko D, Davis H, Baker A-M, Biswas S, Arnold R, Chegwidden L, Gatenbee C et al. 2018. The evolutionary landscape of colorectal tumorigenesis. Nat Ecol Evol, 2 (10), pp. 1661-1672.

The evolutionary events that cause colorectal adenomas (benign) to progress to carcinomas (malignant) remain largely undetermined. Using multi-region genome and exome sequencing of 24 benign and malignant colorectal tumours, we investigate the evolutionary fitness landscape occupied by these neoplasms. Unlike carcinomas, advanced adenomas frequently harbour sub-clonal driver mutations (considered to be functionally important in the carcinogenic process) that have not swept to fixation, and have relatively high genetic heterogeneity. Carcinomas are distinguished from adenomas by widespread aneusomies that are usually clonal and often accrue in a 'punctuated' fashion. We conclude that adenomas evolve across an undulating fitness landscape, whereas carcinomas occupy a sharper fitness peak, probably owing to stabilizing selection.

Bolli N, Maura F, Minvielle S, Gloznik D, Szalat R, Fullam A, Martincorena I, Dawson KJ, Samur MK, Zamora J et al. 2018. Genomic patterns of progression in smoldering multiple myeloma. Nat Commun, 9 (1), pp. 3363.

We analyzed whole genomes of unique paired samples from smoldering multiple myeloma (SMM) patients progressing to multiple myeloma (MM). We report that the genomic landscape, including mutational profile and structural rearrangements at the smoldering stage is very similar to MM. Paired sample analysis shows two different patterns of progression: a "static progression model", where the subclonal architecture is retained as the disease progressed to MM suggesting that progression solely reflects the time needed to accumulate a sufficient disease burden; and a "spontaneous evolution model", where a change in the subclonal composition is observed. We also observe that activation-induced cytidine deaminase plays a major role in shaping the mutational landscape of early subclinical phases, while progression is driven by APOBEC cytidine deaminases. These results provide a unique insight into myelomagenesis with potential implications for the definition of smoldering disease and timing of treatment initiation.

Li X, Francies HE, Secrier M, Perner J, Miremadi A, Galeano-Dalmau N, Barendt WJ, Letchford L, Leyden GM, Goffin EK et al. 2018. Organoid cultures recapitulate esophageal adenocarcinoma heterogeneity providing a model for clonality studies and precision therapeutics. Nat Commun, 9 (1), pp. 2983.

Esophageal adenocarcinoma (EAC) incidence is increasing while 5-year survival rates remain less than 15%. A lack of experimental models has hampered progress. We have generated clinically annotated EAC organoid cultures that recapitulate the morphology, genomic, and transcriptomic landscape of the primary tumor including point mutations, copy number alterations, and mutational signatures. Karyotyping of organoid cultures has confirmed polyclonality reflecting the clonal architecture of the primary tumor. Furthermore, subclones underwent clonal selection associated with driver gene status. Medium throughput drug sensitivity testing demonstrates the potential of targeting receptor tyrosine kinases and downstream mediators. EAC organoid cultures provide a pre-clinical tool for studies of clonal evolution and precision therapeutics.

Wedge DC, Gundem G, Mitchell T, Woodcock DJ, Martincorena I, Ghori M, Zamora J, Butler A, Whitaker H, Kote-Jarai Z et al. 2018. Sequencing of prostate cancers identifies new cancer genes, routes of progression and drug targets. Nat Genet, 50 (5), pp. 682-692.

Prostate cancer represents a substantial clinical challenge because it is difficult to predict outcome and advanced disease is often fatal. We sequenced the whole genomes of 112 primary and metastatic prostate cancer samples. From joint analysis of these cancers with those from previous studies (930 cancers in total), we found evidence for 22 previously unidentified putative driver genes harboring coding mutations, as well as evidence for NEAT1 and FOXA1 acting as drivers through noncoding mutations. Through the temporal dissection of aberrations, we identified driver mutations specifically associated with steps in the progression of prostate cancer, establishing, for example, loss of CHD1 and BRCA2 as early events in cancer development of ETS fusion-negative cancers. Computational chemogenomic (canSAR) analysis of prostate cancer mutations identified 11 targets of approved drugs, 7 targets of investigational drugs, and 62 targets of compounds that may be active and should be considered candidates for future clinical trials.

Mitchell TJ, Turajlic S, Rowan A, Nicol D, Farmery JHR, O'Brien T, Martincorena I, Tarpey P, Angelopoulos N, Yates LR et al. 2018. Timing the Landmark Events in the Evolution of Clear Cell Renal Cell Cancer: TRACERx Renal. Cell, 173 (3), pp. 611-623.e17.

Clear cell renal cell carcinoma (ccRCC) is characterized by near-universal loss of the short arm of chromosome 3, deleting several tumor suppressor genes. We analyzed whole genomes from 95 biopsies across 33 patients with clear cell renal cell carcinoma. We find hotspots of point mutations in the 5' UTR of TERT, targeting a MYC-MAX-MAD1 repressor associated with telomere lengthening. The most common structural abnormality generates simultaneous 3p loss and 5q gain (36% patients), typically through chromothripsis. This event occurs in childhood or adolescence, generally as the initiating event that precedes emergence of the tumor's most recent common ancestor by years to decades. Similar genomic changes drive inherited ccRCC. Modeling differences in age incidence between inherited and sporadic cancers suggests that the number of cells with 3p loss capable of initiating sporadic tumors is no more than a few hundred. Early development of ccRCC follows well-defined evolutionary trajectories, offering opportunity for early intervention.

Cheng J, Demeulemeester J, Wedge DC, Vollan HKM, Pitt JJ, Russnes HG, Pandey BP, Nilsen G, Nord S, Bignell GR et al. 2017. Pan-cancer analysis of homozygous deletions in primary tumours uncovers rare tumour suppressors. Nat Commun, 8 (1), pp. 1221.

Homozygous deletions are rare in cancers and often target tumour suppressor genes. Here, we build a compendium of 2218 primary tumours across 12 human cancer types and systematically screen for homozygous deletions, aiming to identify rare tumour suppressors. Our analysis defines 96 genomic regions recurrently targeted by homozygous deletions. These recurrent homozygous deletions occur either over tumour suppressors or over fragile sites, regions of increased genomic instability. We construct a statistical model that separates fragile sites from regions showing signatures of positive selection for homozygous deletions and identify candidate tumour suppressors within those regions. We find 16 established tumour suppressors and propose 27 candidate tumour suppressors. Several of these genes (including MGMT, RAD17, and USP44) show prior evidence of a tumour suppressive function. Other candidate tumour suppressors, such as MAFTRR, KIAA1551, and IGF2BP2, are novel. Our study demonstrates how rare tumour suppressors can be identified through copy number meta-analysis.

Camacho N, Van Loo P, Edwards S, Kay JD, Matthews L, Haase K, Clark J, Dennis N, Thomas S, Kremeyer B et al. 2017. Appraising the relevance of DNA copy number loss and gain in prostate cancer using whole genome DNA sequence data. PLoS Genet, 13 (9), pp. e1007001.

A variety of models have been proposed to explain regions of recurrent somatic copy number alteration (SCNA) in human cancer. Our study employs Whole Genome DNA Sequence (WGS) data from tumor samples (n = 103) to comprehensively assess the role of the Knudson two hit genetic model in SCNA generation in prostate cancer. 64 recurrent regions of loss and gain were detected, of which 28 were novel, including regions of loss with more than 15% frequency at Chr4p15.2-p15.1 (15.53%), Chr6q27 (16.50%) and Chr18q12.3 (17.48%). Comprehensive mutation screens of genes, lincRNA encoding sequences, control regions and conserved domains within SCNAs demonstrated that a two-hit genetic model was supported in only a minor proportion of recurrent SCNA losses examined (15/40). We found that recurrent breakpoints and regions of inversion often occur within Knudson model SCNAs, leading to the identification of ZNF292 as a target gene for the deletion at 6q14.3-q15 and NKX3.1 as a two-hit target at 8p21.3-p21.2. The importance of alterations of lincRNA sequences was illustrated by the identification of a novel mutational hotspot at the KCCAT42, FENDRR, CAT1886 and STCAT2 loci at the 16q23.1-q24.3 loss. Our data confirm that the burden of SCNAs is predictive of biochemical recurrence, define nine individual regions that are associated with relapse, and highlight the possible importance of ion channel and G-protein coupled-receptor (GPCR) pathways in cancer development. We concluded that a two-hit genetic model accounts for about one third of SCNA, indicating that mechanisms such as haploinsufficiency and epigenetic inactivation account for the remaining SCNA losses.

Dentro SC, Wedge DC, Van Loo P. 2017. Principles of Reconstructing the Subclonal Architecture of Cancers. Cold Spring Harb Perspect Med, 7 (8), pp. a026625-a026625.

Most cancers evolve from a single founder cell through a series of clonal expansions that are driven by somatic mutations. These clonal expansions can lead to several coexisting subclones sharing subsets of mutations. Analysis of massively parallel sequencing data can infer a tumor's subclonal composition through the identification of populations of cells with shared mutations. We describe the principles that underlie subclonal reconstruction through single nucleotide variants (SNVs) or copy number alterations (CNAs) from bulk or single-cell sequencing. These principles include estimating the fraction of tumor cells for SNVs and CNAs, performing clustering of SNVs from single- and multisample cases, and single-cell sequencing. The application of subclonal reconstruction methods is providing key insights into tumor evolution, identifying subclonal driver mutations, patterns of parallel evolution and differences in mutational signatures between cellular populations, and characterizing the mechanisms of therapy resistance, spread, and metastasis.

Yates LR, Knappskog S, Wedge D, Farmery JHR, Gonzalez S, Martincorena I, Alexandrov LB, Van Loo P, Haugland HK, Lilleng PK et al. 2017. Genomic Evolution of Breast Cancer Metastasis and Relapse. Cancer Cell, 32 (2), pp. 169-184.e7.

Patterns of genomic evolution between primary and metastatic breast cancer have not been studied in large numbers, despite patients with metastatic breast cancer having dismal survival. We sequenced whole genomes or a panel of 365 genes on 299 samples from 170 patients with locally relapsed or metastatic breast cancer. Several lines of analysis indicate that clones seeding metastasis or relapse disseminate late from primary tumors, but continue to acquire mutations, mostly accessing the same mutational processes active in the primary tumor. Most distant metastases acquired driver mutations not seen in the primary tumor, drawing from a wider repertoire of cancer genes than early drivers. These include a number of clinically actionable alterations and mutations inactivating SWI-SNF and JAK2-STAT3 pathways.

Behjati S, Tarpey PS, Haase K, Ye H, Young MD, Alexandrov LB, Farndon SJ, Collord G, Wedge DC, Martincorena I et al. 2017. Recurrent mutation of IGF signalling genes and distinct patterns of genomic rearrangement in osteosarcoma. Nat Commun, 8 (1), pp. 15936.

Osteosarcoma is a primary malignancy of bone that affects children and adults. Here, we present the largest sequencing study of osteosarcoma to date, comprising 112 childhood and adult tumours encompassing all major histological subtypes. A key finding of our study is the identification of mutations in insulin-like growth factor (IGF) signalling genes in 8/112 (7%) of cases. We validate this observation using fluorescence in situ hybridization (FISH) in an additional 87 osteosarcomas, with IGF1 receptor (IGF1R) amplification observed in 14% of tumours. These findings may inform patient selection in future trials of IGF1R inhibitors in osteosarcoma. Analysing patterns of mutation, we identify distinct rearrangement profiles including a process characterized by chromothripsis and amplification. This process operates recurrently at discrete genomic regions and generates driver mutations. It may represent an age-independent mutational mechanism that contributes to the development of osteosarcoma in children and adults alike.

Ju YS, Martincorena I, Gerstung M, Petljak M, Alexandrov LB, Rahbari R, Wedge DC, Davies HR, Ramakrishna M, Fullam A et al. 2017. Somatic mutations reveal asymmetric cellular dynamics in the early human embryo. Nature, 543 (7647), pp. 714-718.

Somatic cells acquire mutations throughout the course of an individual's life. Mutations occurring early in embryogenesis are often present in a substantial proportion of, but not all, cells in postnatal humans and thus have particular characteristics and effects. Depending on their location in the genome and the proportion of cells they are present in, these mosaic mutations can cause a wide range of genetic disease syndromes and predispose carriers to cancer. They have a high chance of being transmitted to offspring as de novo germline mutations and, in principle, can provide insights into early human embryonic cell lineages and their contributions to adult tissues. Although it is known that gross chromosomal abnormalities are remarkably common in early human embryos, our understanding of early embryonic somatic mutations is very limited. Here we use whole-genome sequences of normal blood from 241 adults to identify 163 early embryonic mutations. We estimate that approximately three base substitution mutations occur per cell per cell-doubling event in early human embryogenesis and these are mainly attributable to two known mutational signatures. We used the mutations to reconstruct developmental lineages of adult cells and demonstrate that the two daughter cells of many early embryonic cell-doubling events contribute asymmetrically to adult blood at an approximately 2:1 ratio. This study therefore provides insights into the mutation rates, mutational processes and developmental outcomes of cell dynamics that operate during early human embryogenesis.

Grove CS, Bolli N, Manes N, Varela I, Van't Veer M, Bench A, Eldaly H, Wedge D, Van Loo P, Vassiliou GS. 2017. Rapid parallel acquisition of somatic mutations after NPM1 in acute myeloid leukaemia evolution. Br J Haematol, 176 (5), pp. 825-829.

Macintyre G, Van Loo P, Corcoran NM, Wedge DC, Markowetz F, Hovens CM. 2017. How Subclonal Modeling Is Changing the Metastatic Paradigm. Clin Cancer Res, 23 (3), pp. 630-635.

A concerted effort to sequence matched primary and metastatic tumors is vastly improving our ability to understand metastasis in humans. Compelling evidence has emerged that supports the existence of diverse and surprising metastatic patterns. Enhancing these efforts is a new class of algorithms that facilitate high-resolution subclonal modeling of metastatic spread. Here we summarize how subclonal models of metastasis are influencing the metastatic paradigm. Clin Cancer Res; 23(3); 630-5. ©2016 AACR.

Demeulemeester J, Kumar P, Møller EK, Nord S, Wedge DC, Peterson A, Mathiesen RR, Fjelldal R, Zamani Esteki M, Theunis K et al. 2016. Tracing the origin of disseminated tumor cells in breast cancer using single-cell sequencing. Genome Biol, 17 (1), pp. 250.

BACKGROUND: Single-cell micro-metastases of solid tumors often occur in the bone marrow. These disseminated tumor cells (DTCs) may resist therapy and lie dormant or progress to cause overt bone and visceral metastases. The molecular nature of DTCs remains elusive, as well as when and from where in the tumor they originate. Here, we apply single-cell sequencing to identify and trace the origin of DTCs in breast cancer. RESULTS: We sequence the genomes of 63 single cells isolated from six non-metastatic breast cancer patients. By comparing the cells' DNA copy number aberration (CNA) landscapes with those of the primary tumors and lymph node metastasis, we establish that 53% of the single cells morphologically classified as tumor cells are DTCs disseminating from the observed tumor. The remaining cells represent either non-aberrant "normal" cells or "aberrant cells of unknown origin" that have CNA landscapes discordant from the tumor. Further analyses suggest that the prevalence of aberrant cells of unknown origin is age-dependent and that at least a subset is hematopoietic in origin. Evolutionary reconstruction analysis of bulk tumor and DTC genomes enabled ordering of CNA events in molecular pseudo-time and traced the origin of the DTCs to either the main tumor clone, primary tumor subclones, or subclones in an axillary lymph node metastasis. CONCLUSIONS: Single-cell sequencing of bone marrow epithelial-like cells, in parallel with intra-tumor genetic heterogeneity profiling from bulk DNA, is a powerful approach to identify and study DTCs, yielding insight into metastatic processes. A heterogeneous population of CNA-positive cells is present in the bone marrow of non-metastatic breast cancer patients, only some of which are derived from the observed tumor lineages.

Raine KM, Van Loo P, Wedge DC, Jones D, Menzies A, Butler AP, Teague JW, Tarpey P, Nik-Zainal S, Campbell PJ. 2016. ascatNgs: Identifying Somatically Acquired Copy-Number Alterations from Whole-Genome Sequencing Data. Curr Protoc Bioinformatics, 56 (1), pp. 15.9.1-15.9.17.

We have developed ascatNgs to aid researchers in carrying out Allele-Specific Copy number Analysis of Tumours (ASCAT). ASCAT is capable of detecting DNA copy number changes affecting a tumor genome when comparing to a matched normal sample. Additionally, the algorithm estimates the amount of tumor DNA in the sample, known as Aberrant Cell Fraction (ACF). ASCAT itself is an R-package which requires the generation of many file types. Here, we present a suite of tools to help handle this for the user. Our code is available on our GitHub site (https://github.com/cancerit). This unit describes both 'one-shot' execution and approaches more suitable for large-scale compute farms. © 2016 by John Wiley & Sons, Inc.

Dimitriou M, Woll PS, Mortera-Blanco T, Karimi M, Wedge DC, Doolittle H, Douagi I, Papaemmanuil E, Jacobsen SEW, Hellström-Lindberg E. 2016. Perturbed hematopoietic stem and progenitor cell hierarchy in myelodysplastic syndromes patients with monosomy 7 as the sole cytogenetic abnormality. Oncotarget, 7 (45), pp. 72685-72698.

The stem and progenitor cell compartments in low- and intermediate-risk myelodysplastic syndromes (MDS) have recently been described, and shown to be highly conserved when compared to those in acute myeloid leukemia (AML). Much less is known about the characteristics of the hematopoietic hierarchy of subgroups of MDS with a high risk of transforming to AML. Immunophenotypic analysis of immature stem and progenitor cell compartments from patients with an isolated loss of the entire chromosome 7 (isolated -7), an independent high-risk genetic event in MDS, showed expansion and dominance of the malignant -7 clone in the granulocyte and macrophage progenitors (GMP), and other CD45RA+ progenitor compartments, and a significant reduction of the LIN-CD34+CD38low/-CD90+CD45RA- hematopoietic stem cell (HSC) compartment, highly reminiscent of what is typically seen in AML, and distinct from low-risk MDS. Established functional in vitro and in vivo stem cell assays showed a poor readout for -7 MDS patients irrespective of marrow blast counts. Moreover, while the -7 clone dominated at all stages of GM differentiation, the -7 clone had a competitive disadvantage in erythroid differentiation. In azacitidine-treated -7 MDS patients with a clinical response, the decreased clonal involvement in mononuclear bone marrow cells was not accompanied by a parallel reduced clonal involvement in the dominant CD45RA+ progenitor populations, suggesting a selective azacitidine-resistance of these distinct -7 progenitor compartments. Our data demonstrate, in a subgroup of high-risk MDS with monosomy 7, that the perturbed stem and progenitor cell compartments more closely resemble those of AML than those of low-risk MDS.

Behjati S, Gundem G, Wedge DC, Roberts ND, Tarpey PS, Cooke SL, Van Loo P, Alexandrov LB, Ramakrishna M, Davies H et al. 2016. Mutational signatures of ionizing radiation in second malignancies. Nat Commun, 7 (1), pp. 12605.

Ionizing radiation is a potent carcinogen, inducing cancer through DNA damage. The signatures of mutations arising in human tissues following in vivo exposure to ionizing radiation have not been documented. Here, we searched for signatures of ionizing radiation in 12 radiation-associated second malignancies of different tumour types. Two signatures of somatic mutation characterize ionizing radiation exposure irrespective of tumour type. Compared with 319 radiation-naive tumours, radiation-associated tumours carry a median extra 201 deletions genome-wide, sized 1-100 base pairs often with microhomology at the junction. Unlike deletions of radiation-naive tumours, these show no variation in density across the genome or correlation with sequence context, replication timing or chromatin structure. Furthermore, we observe a significant increase in balanced inversions in radiation-associated tumours. Both small deletions and inversions generate driver mutations. Thus, ionizing radiation generates distinctive mutational signatures that explain its carcinogenic potential.

Sandhu V, Wedge DC, Bowitz Lothe IM, Labori KJ, Dentro SC, Buanes T, Skrede ML, Dalsgaard AM, Munthe E, Myklebost O et al. 2016. The Genomic Landscape of Pancreatic and Periampullary Adenocarcinoma. Cancer Res, 76 (17), pp. 5092-5102.

Despite advances in diagnostics, less than 5% of patients with periampullary tumors experience an overall survival of five years or more. Periampullary tumors are neoplasms that arise in the vicinity of the ampulla of Vater, an enlargement of liver and pancreas ducts where they join and enter the small intestine. In this study, we analyzed copy number aberrations using Affymetrix SNP 6.0 arrays in 60 periampullary adenocarcinomas from Oslo University Hospital to identify genome-wide copy number aberrations, putative driver genes, deregulated pathways, and potential prognostic markers. Results were validated in a separate cohort derived from The Cancer Genome Atlas Consortium (n = 127). In contrast to many other solid tumors, periampullary adenocarcinomas exhibited more frequent genomic deletions than gains. Genes in the frequently codeleted region 17p13 and 18q21/22 were associated with cell cycle, apoptosis, and p53 and Wnt signaling. By integrating genomics and transcriptomics data from the same patients, we identified CCNE1 and ERBB2 as candidate driver genes. Morphologic subtypes of periampullary adenocarcinomas (i.e., pancreatobiliary or intestinal) harbor many common genomic aberrations. However, gain of 13q and 3q, and deletions of 5q were found specific to the intestinal subtype. Our study also implicated the use of the PAM50 classifier in identifying a subgroup of patients with a high proliferation rate, which had impaired survival. Furthermore, gain of 18p11 (18p11.21-23, 18p11.31-32) and 19q13 (19q13.2, 19q13.31-32) and subsequent overexpression of the genes in these loci were associated with impaired survival. Our work identifies potential prognostic markers for periampullary tumors, the genetic characterization of which has lagged.

Shlien A, Raine K, Fuligni F, Arnold R, Nik-Zainal S, Dronov S, Mamanova L, Rosic A, Ju YS, Cooke SL et al. 2016. Direct Transcriptional Consequences of Somatic Mutation in Breast Cancer. Cell Rep, 16 (7), pp. 2032-2046.

Disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription, coordinated secondary pathway alterations, and increased transcriptional noise. To catalog the rules governing how somatic mutation exerts direct transcriptional effects, we developed an exhaustive pipeline for analyzing RNA sequencing data, which we integrated with whole genomes from 23 breast cancers. Using X-inactivation analyses, we found that cancer cells are more transcriptionally active than intermixed stromal cells. This is especially true in estrogen receptor (ER)-negative tumors. Overall, 59% of substitutions were expressed. Nonsense mutations showed lower expression levels than expected, with patterns characteristic of nonsense-mediated decay. 14% of 4,234 rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusions, and premature polyadenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data reveals the rules by which transcriptional machinery interprets somatic mutation.

Strakova A, Ní Leathlobhair M, Wang G-D, Yin T-T, Airikkala-Otter I, Allen JL, Allum KM, Bansse-Issa L, Bisson JL, Castillo Domracheva A et al. 2016. Mitochondrial genetic diversity, selection and recombination in a canine transmissible cancer. Elife, 5 (MAY2016).

Canine transmissible venereal tumour (CTVT) is a clonally transmissible cancer that originated approximately 11,000 years ago and affects dogs worldwide. Despite the clonal origin of the CTVT nuclear genome, CTVT mitochondrial genomes (mtDNAs) have been acquired by periodic capture from transient hosts. We sequenced 449 complete mtDNAs from a global population of CTVTs, and show that mtDNA horizontal transfer has occurred at least five times, delineating five tumour clades whose distributions track two millennia of dog global migration. Negative selection has operated to prevent accumulation of deleterious mutations in captured mtDNA, and recombination has caused occasional mtDNA re-assortment. These findings implicate functional mtDNA as a driver of CTVT global metastatic spread, further highlighting the important role of mtDNA in cancer evolution.

Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, Martincorena I, Alexandrov LB, Martin S, Wedge DC et al. 2016. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature, 534 (7605), pp. 47-54.

We analysed whole-genome sequences of 560 breast cancers to advance understanding of the driver mutations conferring clonal advantage and the mutational processes generating somatic mutations. We found that 93 protein-coding cancer genes carried probable driver mutations. Some non-coding regions exhibited high mutation frequencies, but most have distinctive structural features probably causing elevated mutation rates and do not contain driver mutations. Mutational signature analysis was extended to genome rearrangements and revealed twelve base substitution and six rearrangement signatures. Three rearrangement signatures, characterized by tandem duplications or deletions, appear associated with defective homologous-recombination-based DNA repair: one with deficient BRCA1 function, another with deficient BRCA1 or BRCA2 function, the cause of the third is unknown. This analysis of all classes of somatic mutation across exons, introns and intergenic regions highlights the repertoire of cancer genes and mutational processes operating, and progresses towards a comprehensive account of the somatic genetic basis of breast cancer.

Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-Zainal S, Stratton MR. 2015. Clock-like mutational processes in human somatic cells. Nat Genet, 47 (12), pp. 1402-1407.

During the course of a lifetime, somatic cells acquire mutations. Different mutational processes may contribute to the mutations accumulated in a cell, with each imprinting a mutational signature on the cell's genome. Some processes generate mutations throughout life at a constant rate in all individuals, and the number of mutations in a cell attributable to these processes will be proportional to the chronological age of the person. Using mutations from 10,250 cancer genomes across 36 cancer types, we investigated clock-like mutational processes that have been operating in normal human cells. Two mutational signatures show clock-like properties. Both exhibit different mutation rates in different tissues. However, their mutation rates are not correlated, indicating that the underlying processes are subject to different biological influences. For one signature, the rate of cell division may influence its mutation rate. This study provides the first survey of clock-like mutational processes operating in human somatic cells.

Nangalia J, Nice FL, Wedge DC, Godfrey AL, Grinfeld J, Thakker C, Massie CE, Baxter J, Sewell D, Silber Y et al. 2015. DNMT3A mutations occur early or late in patients with myeloproliferative neoplasms and mutation order influences phenotype. Haematologica, 100 (11), pp. e438-e442.

Knappskog S, Berge EO, Chrisanthar R, Geisler S, Staalesen V, Leirvaag B, Yndestad S, de Faveri E, Karlsen BO, Wedge DC et al. 2015. Concomitant inactivation of the p53- and pRB- functional pathways predicts resistance to DNA damaging drugs in breast cancer in vivo. Mol Oncol, 9 (8), pp. 1553-1564.

Chemoresistance is the main obstacle to cancer cure. In contrast to studies focusing on single gene mutations, we hypothesize chemoresistance to be due to inactivation of key pathways affecting cellular mechanisms such as apoptosis, senescence, or DNA repair. In support of this hypothesis, we have previously shown inactivation of either TP53 or its key activators CHK2 and ATM to predict resistance to DNA damaging drugs in breast cancer better than TP53 mutations alone. Further, we hypothesized that redundant pathway(s) may compensate for loss of p53-pathway signaling and that these are inactivated as well in resistant tumour cells. Here, we assessed genetic alterations of the retinoblastoma gene (RB1) and its key regulators: Cyclin D and E as well as their inhibitors p16 and p27. In an exploratory cohort of 69 patients selected from two prospective studies treated with either doxorubicin monotherapy or 5-FU and mitomycin for locally advanced breast cancers, we found defects in the pRB-pathway to be associated with therapy resistance (p-values ranging from 0.001 to 0.094, depending on the cut-off value applied to p27 expression levels). Although statistically weaker, we observed confirmatory associations in a validation cohort from another prospective study (n = 107 patients treated with neoadjuvant epirubicin monotherapy; p-values ranging from 7.0 × 10⁻⁴ to 0.001 in the combined data sets). Importantly, inactivation of the p53- and the pRB-pathways in concert predicted resistance to therapy more strongly than each of the two pathways assessed individually (exploratory cohort: p-values ranging from 3.9 × 10⁻⁶ to 7.5 × 10⁻³ depending on cut-off values applied to ATM and p27 mRNA expression levels). Again, similar findings were confirmed in the validation cohort, with p-values ranging from 6.0 × 10⁻⁷ to 6.5 × 10⁻⁵ in the combined data sets. Our findings strongly indicate that concomitant inactivation of the p53- and pRB-pathways predicts resistance towards anthracyclines and mitomycin in breast cancer in vivo.

Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, Raine K, Jones D, Marshall J, Ramakrishna M et al. 2015. The Life History of 21 Breast Cancers (vol 149, pg 994, 2012). Cell, 162 (4), pp. 924.

Yates LR, Gerstung M, Knappskog S, Desmedt C, Gundem G, Van Loo P, Aas T, Alexandrov LB, Larsimont D, Davies H et al. 2015. Subclonal diversification of primary breast cancer revealed by multiregion sequencing. Nat Med, 21 (7), pp. 751-759.

The sequencing of cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

Cooper CS, Eeles R, Wedge DC, Van Loo P, Gundem G, Alexandrov LB, Kremeyer B, Butler A, Lynch AG, Camacho N et al. 2015. Corrigendum: analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue. Nat Genet, 47 (6), pp. 689.

Ju YS, Tubio JMC, Mifsud W, Fu B, Davies HR, Ramakrishna M, Li Y, Yates L, Gundem G, Tarpey PS et al. 2015. Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells. Genome Res, 25 (6), pp. 814-824.

Mitochondrial genomes are separated from the nuclear genome for most of the cell cycle by the nuclear double membrane, intervening cytoplasm, and the mitochondrial double membrane. Despite these physical barriers, we show that somatically acquired mitochondrial-nuclear genome fusion sequences are present in cancer cells. Most occur in conjunction with intranuclear genomic rearrangements, and the features of the fusion fragments indicate that nonhomologous end joining and/or replication-dependent DNA double-strand break repair are the dominant mechanisms involved. Remarkably, mitochondrial-nuclear genome fusions occur at a similar rate per base pair of DNA as interchromosomal nuclear rearrangements, indicating the presence of a high frequency of contact between mitochondrial and nuclear DNA in some somatic cells. Transmission of mitochondrial DNA to the nuclear genome occurs in neoplastically transformed cells, but we do not exclude the possibility that some mitochondrial-nuclear DNA fusions observed in cancer occurred years earlier in normal somatic cells.

Martincorena I, Roshan A, Gerstung M, Ellis P, Van Loo P, McLaren S, Wedge DC, Fullam A, Alexandrov LB, Tubio JM et al. 2015. Tumor evolution. High burden and pervasive positive selection of somatic mutations in normal human skin. Science, 348 (6237), pp. 880-886.

How somatic mutations accumulate in normal cells is central to understanding cancer development but is poorly understood. We performed ultradeep sequencing of 74 cancer genes in small (0.8 to 4.7 square millimeters) biopsies of normal skin. Across 234 biopsies of sun-exposed eyelid epidermis from four individuals, the burden of somatic mutations averaged two to six mutations per megabase per cell, similar to that seen in many cancers, and exhibited characteristic signatures of exposure to ultraviolet light. Remarkably, multiple cancer genes are under strong positive selection even in physiologically normal skin, including most of the key drivers of cutaneous squamous cell carcinomas. Positively selected mutations were found in 18 to 32% of normal skin cells at a density of ~140 driver mutations per square centimeter. We observed variability in the driver landscape among individuals and variability in the sizes of clonal expansions across genes. Thus, aged sun-exposed skin is a patchwork of thousands of evolving clones with over a quarter of cells carrying cancer-causing mutations while maintaining the physiological functions of epidermis.

Gundem G, Van Loo P, Kremeyer B, Alexandrov LB, Tubio JMC, Papaemmanuil E, Brewer DS, Kallio HML, Högnäs G, Annala M et al. 2015. The evolutionary history of lethal metastatic prostate cancer. Nature, 520 (7547), pp. 353-357.

Cancers emerge from an ongoing Darwinian evolutionary process, often leading to multiple competing subclones within a single primary tumour. This evolutionary process culminates in the formation of metastases, which is the cause of 90% of cancer-related deaths. However, despite its clinical importance, little is known about the principles governing the dissemination of cancer cells to distant organs. Although the hypothesis that each metastasis originates from a single tumour cell is generally supported, recent studies using mouse models of cancer demonstrated the existence of polyclonal seeding from and interclonal cooperation between multiple subclones. Here we sought definitive evidence for the existence of polyclonal seeding in human malignancy and to establish the clonal relationship among different metastases in the context of androgen-deprived metastatic prostate cancer. Using whole-genome sequencing, we characterized multiple metastases arising from prostate tumours in ten patients. Integrated analyses of subclonal architecture revealed the patterns of metastatic spread in unprecedented detail. Metastasis-to-metastasis spread was found to be common, either through de novo monoclonal seeding of daughter metastases or, in five cases, through the transfer of multiple tumour clones between metastatic sites. Lesions affecting tumour suppressor genes usually occur as single events, whereas mutations in genes involved in androgen receptor signalling commonly involve multiple, convergent events in different metastases. Our results elucidate in detail the complex patterns of metastatic spread and further our understanding of the development of resistance to androgen-deprivation therapy in prostate cancer.

Hong MKH, Macintyre G, Wedge DC, Van Loo P, Patel K, Lunke S, Alexandrov LB, Sloggett C, Cmero M, Marass F et al. 2015. Tracking the origins and drivers of subclonal metastatic expansion in prostate cancer. Nat Commun, 6 (1), pp. 6605.

Tumour heterogeneity in primary prostate cancer is a well-established phenomenon. However, how the subclonal diversity of tumours changes during metastasis and progression to lethality is poorly understood. Here we reveal the precise direction of metastatic spread across four lethal prostate cancer patients using whole-genome and ultra-deep targeted sequencing of longitudinally collected primary and metastatic tumours. We find one case of metastatic spread to the surgical bed causing local recurrence, and another case of cross-metastatic site seeding combining with dynamic remoulding of subclonal mixtures in response to therapy. By ultra-deep sequencing end-stage blood, we detect both metastatic and primary tumour clones, even years after removal of the prostate. Analysis of mutations associated with metastasis reveals an enrichment of TP53 mutations, and additional sequencing of metastases from 19 patients demonstrates that acquisition of TP53 mutations is linked with the expansion of subclones with metastatic potential which we can detect in the blood.

Woll PS, Kjaellquist U, Chowdhury O, Doolittle H, Wedge DC, Thongjuea S, Erlandsson R, Ngara M, Anderson K, Deng Q et al. 2015. Myelodysplastic Syndromes Are Propagated by Rare and Distinct Human Cancer Stem Cells In Vivo (vol 25, pg 794, 2014). Cancer Cell, 27 (4), pp. 603-605.

Cooper CS, Eeles R, Wedge DC, Van Loo P, Gundem G, Alexandrov LB, Kremeyer B, Butler A, Lynch AG, Camacho N et al. 2015. Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue. Nat Genet, 47 (4), pp. 367-372.

Genome-wide DNA sequencing was used to decrypt the phylogeny of multiple samples from distinct areas of cancer and morphologically normal tissue taken from the prostates of three men. Mutations were present at high levels in morphologically normal tissue distant from the cancer, reflecting clonal expansions, and the underlying mutational processes at work in morphologically normal tissue were also at work in cancer. Our observations demonstrate the existence of ongoing abnormal mutational processes, consistent with field effects, underlying carcinogenesis. This mechanism gives rise to extensive branching evolution and cancer clone mixing, as exemplified by the coexistence of multiple cancer lineages harboring distinct ERG fusions within a single cancer nodule. Subsets of mutations were shared either by morphologically normal and malignant tissues or between different ERG lineages, indicating earlier or separate clonal cell expansions. Our observations inform on the origin of multifocal disease and have implications for prostate cancer therapy in individual cases.

Presneau N, Baumhoer D, Behjati S, Pillay N, Tarpey P, Campbell PJ, Jundt G, Hamoudi R, Wedge DC, Loo PV et al. 2015. Diagnostic value of H3F3A mutations in giant cell tumour of bone compared to osteoclast-rich mimics. J Pathol Clin Res, 1 (2), pp. 113-123.

Driver mutations in the two histone 3.3 (H3.3) genes, H3F3A and H3F3B, were recently identified by whole genome sequencing in 95% of chondroblastoma (CB) and by targeted gene sequencing in 92% of giant cell tumour of bone (GCT). Given the high prevalence of these driver mutations, it may be possible to utilise these alterations as diagnostic adjuncts in clinical practice. Here, we explored the spectrum of H3.3 mutations in a wide range and large number of bone tumours (n = 412) to determine if these alterations could be used to distinguish GCT from other osteoclast-rich tumours such as aneurysmal bone cyst, nonossifying fibroma, giant cell granuloma, and osteoclast-rich malignant bone tumours and others. In addition, we explored the driver landscape of GCT through whole genome, exome and targeted sequencing (14 gene panel). We found that H3.3 mutations, namely mutations of glycine 34 in H3F3A, occur in 96% of GCT. We did not find additional driver mutations in GCT, including mutations in IDH1, IDH2, USP6, TP53. The genomes of GCT exhibited few somatic mutations, akin to the picture seen in CB. Overall our observations suggest that the presence of H3F3A p.Gly34 mutations does not entirely exclude malignancy in osteoclast-rich tumours. However, H3F3A p.Gly34 mutations appear to be an almost essential feature of GCT that will aid pathological evaluation of bone tumours, especially when confronted with small needle core biopsies. In the absence of H3F3A p.Gly34 mutations, a diagnosis of GCT should be made with caution.

Shlien A, Campbell BB, de Borja R, Alexandrov LB, Merico D, Wedge D, Van Loo P, Tarpey PS, Coupland P, Behjati S et al. 2015. Combined hereditary and somatic mutations of replication error repair genes result in rapid onset of ultra-hypermutated cancers. Nat Genet, 47 (3), pp. 257-262.

DNA replication-associated mutations are repaired by two components: polymerase proofreading and mismatch repair. The mutation consequences of disruption to both repair components in humans are not well studied. We sequenced cancer genomes from children with inherited biallelic mismatch repair deficiency (bMMRD). High-grade bMMRD brain tumors exhibited massive numbers of substitution mutations (>250/Mb), which was greater than all childhood and most cancers (>7,000 analyzed). All ultra-hypermutated bMMRD cancers acquired early somatic driver mutations in DNA polymerase ɛ or δ. The ensuing mutation signatures and numbers are unique and diagnostic of childhood germ-line bMMRD (P < 10⁻¹³). Sequential tumor biopsy analysis revealed that bMMRD/polymerase-mutant cancers rapidly amass an excess of simultaneous mutations (∼600 mutations/cell division), reaching but not exceeding ∼20,000 exonic mutations in <6 months. This implies a threshold compatible with cancer-cell survival. We suggest a new mechanism of cancer progression in which mutations develop in a rapid burst after ablation of replication repair.

Drogan D, Dunn WB, Lin W, Buijsse B, Schulze MB, Langenberg C, Brown M, Floegel A, Dietrich S, Rolandsson O et al. 2015. Untargeted metabolic profiling identifies altered serum metabolites of type 2 diabetes mellitus in a prospective, nested case control study. Clin Chem, 61 (3), pp. 487-497.

BACKGROUND: Application of metabolite profiling could expand the etiological knowledge of type 2 diabetes mellitus (T2D). However, few prospective studies apply broad untargeted metabolite profiling to reveal the comprehensive metabolic alterations preceding the onset of T2D. METHODS: We applied untargeted metabolite profiling in serum samples obtained from the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam cohort comprising 300 individuals who developed T2D after a median follow-up time of 6 years and 300 matched controls. For that purpose, we used ultraperformance LC-MS with a protocol specifically designed for large-scale metabolomics studies with regard to robustness and repeatability. After multivariate classification to select metabolites with the strongest contribution to disease classification, we applied multivariable-adjusted conditional logistic regression to assess the association of these metabolites with T2D. RESULTS: Among several alterations in lipid metabolism, there was an inverse association with T2D for metabolites chemically annotated as lysophosphatidylcholine(dm16:0) and phosphatidylcholine(O-20:0/O-20:0). Hexose sugars were positively associated with T2D, whereas higher concentrations of a sugar alcohol and a deoxyhexose sugar reduced the odds of diabetes by approximately 60% and 70%, respectively. Furthermore, there was suggestive evidence for a positive association of the circulating purine nucleotide isopentenyladenosine-5'-monophosphate with incident T2D. CONCLUSIONS: This study constitutes one of the largest metabolite profiling approaches of T2D biomarkers in a prospective study population. The findings might help generate new hypotheses about diabetes etiology and develop further targeted studies of a smaller number of potentially important metabolites.

Ortmann CA, Kent DG, Nangalia J, Silber Y, Wedge DC, Grinfeld J, Baxter EJ, Massie CE, Papaemmanuil E, Menon S et al. 2015. Effect of mutation order on myeloproliferative neoplasms. N Engl J Med, 372 (7), pp. 601-612.

BACKGROUND: Cancers result from the accumulation of somatic mutations, and their properties are thought to reflect the sum of these mutations. However, little is known about the effect of the order in which mutations are acquired. METHODS: We determined mutation order in patients with myeloproliferative neoplasms by genotyping hematopoietic colonies or by means of next-generation sequencing. Stem cells and progenitor cells were isolated to study the effect of mutation order on mature and immature hematopoietic cells. RESULTS: The age at which a patient presented with a myeloproliferative neoplasm, acquisition of JAK2 V617F homozygosity, and the balance of immature progenitors were all influenced by mutation order. As compared with patients in whom the TET2 mutation was acquired first (hereafter referred to as "TET2-first patients"), patients in whom the Janus kinase 2 (JAK2) mutation was acquired first ("JAK2-first patients") had a greater likelihood of presenting with polycythemia vera than with essential thrombocythemia, an increased risk of thrombosis, and an increased sensitivity of JAK2-mutant progenitors to ruxolitinib in vitro. Mutation order influenced the proliferative response to JAK2 V617F and the capacity of double-mutant hematopoietic cells and progenitor cells to generate colony-forming cells. Moreover, the hematopoietic stem-and-progenitor-cell compartment was dominated by TET2 single-mutant cells in TET2-first patients but by JAK2-TET2 double-mutant cells in JAK2-first patients. Prior mutation of TET2 altered the transcriptional consequences of JAK2 V617F in a cell-intrinsic manner and prevented JAK2 V617F from up-regulating genes associated with proliferation. CONCLUSIONS: The order in which JAK2 and TET2 mutations were acquired influenced clinical features, the response to targeted therapy, the biology of stem and progenitor cells, and clonal evolution in patients with myeloproliferative neoplasms. 
(Funded by Leukemia and Lymphoma Research and others.)

Rashid NU, Sperling AS, Bolli N, Wedge DC, Van Loo P, Tai Y-T, Shammas MA, Fulciniti M, Samur MK, Richardson PG et al. 2014. Differential and limited expression of mutant alleles in multiple myeloma. Blood, 124 (20), pp. 3110-3117.

Recent work has delineated mutational profiles in multiple myeloma and reported a median of 52 mutations per patient, as well as a set of commonly mutated genes across multiple patients. In this study, we have used deep sequencing of RNA from a subset of these patients to evaluate the proportion of expressed mutations. We find that the majority of previously identified mutations occur within genes with very low or no detectable expression. On average, 27% (range, 11% to 47%) of mutated alleles are found to be expressed, and among mutated genes that are expressed, there often is allele-specific expression where either the mutant or wild-type allele is suppressed. Even in the absence of an overall change in gene expression, the presence of differential allelic expression within malignant cells highlights the important contribution of RNA-sequencing in identifying clinically significant mutational changes relevant to our understanding of myeloma biology and also for therapeutic applications.

Gromski PS, Correa E, Vaughan AA, Wedge DC, Turner ML, Goodacre R. 2014. A comparison of different chemometrics approaches for the robust classification of electronic nose data. Anal Bioanal Chem, 406 (29), pp. 7581-7590.

Accurate detection of certain chemical vapours is important, as these may be diagnostic for the presence of weapons, drugs of misuse or disease. In order to achieve this, chemical sensors could be deployed remotely. However, the readout from such sensors is a multivariate pattern, and this needs to be interpreted robustly using powerful supervised learning methods. Therefore, in this study, we compared the classification accuracy of four pattern recognition algorithms which include linear discriminant analysis (LDA), partial least squares-discriminant analysis (PLS-DA), random forests (RF) and support vector machines (SVM) which employed four different kernels. For this purpose, we have used electronic nose (e-nose) sensor data (Wedge et al., Sensors Actuators B Chem 143:365-372, 2009). In order to allow direct comparison between our four different algorithms, we employed two model validation procedures based on either 10-fold cross-validation or bootstrapping. The results show that LDA (91.56% accuracy) and SVM with a polynomial kernel (91.66% accuracy) were very effective at analysing these e-nose data. These two models gave superior prediction accuracy, sensitivity and specificity in comparison to the other techniques employed. With respect to the e-nose sensor data studied here, our findings recommend that SVM with a polynomial kernel should be favoured as a classification method over the other statistical models that we assessed. SVM with non-linear kernels have the advantage that they can be used for classifying non-linear as well as linear mapping from analytical data space to multi-group classifications and would thus be a suitable algorithm for the analysis of most e-nose sensor data.

Zhang J, Fujimoto J, Zhang J, Wedge DC, Song X, Zhang J, Seth S, Chow C-W, Cao Y, Gumbs C et al. 2014. Intratumor heterogeneity in localized lung adenocarcinomas delineated by multiregion sequencing. Science, 346 (6206), pp. 256-259.

Cancers are composed of populations of cells with distinct molecular and phenotypic features, a phenomenon termed intratumor heterogeneity (ITH). ITH in lung cancers has not been well studied. We applied multiregion whole-exome sequencing (WES) on 11 localized lung adenocarcinomas. All tumors showed clear evidence of ITH. On average, 76% of all mutations and 20 out of 21 known cancer gene mutations were identified in all regions of individual tumors, which suggested that single-region sequencing may be adequate to identify the majority of known cancer gene mutations in localized lung adenocarcinomas. With a median follow-up of 21 months after surgery, three patients have relapsed, and all three patients had significantly larger fractions of subclonal mutations in their primary tumors than patients without relapse. These data indicate that a larger subclonal mutation fraction may be associated with increased likelihood of postsurgical relapse in patients with localized lung adenocarcinomas.

de Bruin EC, McGranahan N, Mitter R, Salm M, Wedge DC, Yates L, Jamal-Hanjani M, Shafi S, Murugaesu N, Rowan AJ et al. 2014. Spatial and temporal diversity in genomic instability processes defines lung cancer evolution. Science, 346 (6206), pp. 251-256.

Spatial and temporal dissection of the genomic changes occurring during the evolution of human non-small cell lung cancer (NSCLC) may help elucidate the basis for its dismal prognosis. We sequenced 25 spatially distinct regions from seven operable NSCLCs and found evidence of branched evolution, with driver mutations arising before and after subclonal diversification. There was pronounced intratumor heterogeneity in copy number alterations, translocations, and mutations associated with APOBEC cytidine deaminase activity. Despite maintained carcinogen exposure, tumors from smokers showed a relative decrease in smoking-related mutations over time, accompanied by an increase in APOBEC-associated mutations. In tumors from former smokers, genome-doubling occurred within a smoking-signature context before subclonal diversification, which suggested that a long period of tumor latency had preceded clinical detection. The regionally separated driver mutations, coupled with the relentless and heterogeneous nature of the genome instability processes, are likely to confound treatment success in NSCLC.

Knappskog S, Gansmo LB, Dibirova K, Metspalu A, Cybulski C, Peterlongo P, Aaltonen L, Vatten L, Romundstad P, Hveem K et al. 2014. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649). Oncotarget, 5 (18), pp. 8223-8234.

The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast, ovarian and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 - 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations, suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk.

Behjati S, Huch M, van Boxtel R, Karthaus W, Wedge DC, Tamuri AU, Martincorena I, Petljak M, Alexandrov LB, Gundem G et al. 2014. Genome sequencing of normal cells reveals developmental lineages and mutational processes. Nature, 513 (7518), pp. 422-425.

The somatic mutations present in the genome of a cell accumulate over the lifetime of a multicellular organism. These mutations can provide insights into the developmental lineage tree, the number of divisions that each cell has undergone and the mutational processes that have been operative. Here we describe whole genomes of clonal lines derived from multiple tissues of healthy mice. Using somatic base substitutions, we reconstructed the early cell divisions of each animal, demonstrating the contributions of embryonic cells to adult tissues. Differences were observed between tissues in the numbers and types of mutations accumulated by each cell, which likely reflect differences in the number of cell divisions they have undergone and varying contributions of different mutational processes. If somatic mutation rates are similar to those in mice, the results indicate that precise insights into development and mutagenesis of normal human cells will be possible.

Gobbi A, Iorio F, Dawson KJ, Wedge DC, Tamborero D, Alexandrov LB, Lopez-Bigas N, Garnett MJ, Jurman G, Saez-Rodriguez J. 2014. Fast randomization of large genomic datasets while preserving alteration counts. Bioinformatics, 30 (17), pp. i617-i623.

MOTIVATION: Studying combinatorial patterns in cancer genomic datasets has recently emerged as a tool for identifying novel cancer driver networks. Approaches have been devised to quantify, for example, the tendency of a set of genes to be mutated in a 'mutually exclusive' manner. The significance of the proposed metrics is usually evaluated by computing P-values under appropriate null models. To this end, a Monte Carlo method (the switching-algorithm) is used to sample simulated datasets under a null model that preserves patient- and gene-wise mutation rates. In this method, a genomic dataset is represented as a bipartite network, to which Markov chain updates (switching-steps) are applied. These steps modify the network topology, and a minimal number of them must be executed to draw simulated datasets independently under the null model. This number has previously been deduced empirically to be a linear function of the total number of variants, making this process computationally expensive. RESULTS: We present a novel approximate lower bound for the number of switching-steps, derived analytically. Additionally, we have developed the R package BiRewire, including new efficient implementations of the switching-algorithm. We illustrate the performance of BiRewire by applying it to large real cancer genomics datasets. We report vast reductions in time requirement, with respect to existing implementations/bounds and equivalent P-value computations. Thus, we propose BiRewire to study statistical properties in genomic datasets, and other data that can be modeled as bipartite networks. AVAILABILITY AND IMPLEMENTATION: BiRewire is available on BioConductor at http://www.bioconductor.org/packages/2.13/bioc/html/BiRewire.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
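
The switching-step described in this abstract can be illustrated with a minimal Python sketch. This is not BiRewire's implementation, and it does not use the paper's analytical bound: the edge-set representation, the function names `randomize` and `degrees`, the toy gene and sample labels, and the `n_steps` and `seed` parameters are all assumptions made for illustration. Each accepted step swaps the sample endpoints of two (gene, sample) edges, so every gene keeps its number of mutated samples and every sample keeps its number of mutated genes.

```python
import random
from collections import Counter


def degrees(edges):
    """Return (per-gene, per-sample) mutation counts for a set of (gene, sample) edges."""
    genes = Counter(g for g, _ in edges)
    samples = Counter(s for _, s in edges)
    return genes, samples


def randomize(edges, n_steps, seed=0):
    """Apply n_steps attempted switching-steps to a bipartite edge set.

    n_steps is a free parameter here; the abstract's lower bound is not reproduced.
    """
    rng = random.Random(seed)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_steps):
        # Pick two distinct edges uniformly at random.
        i, j = rng.sample(range(len(edge_list)), 2)
        (g1, s1), (g2, s2) = edge_list[i], edge_list[j]
        # Reject swaps that would change nothing.
        if g1 == g2 or s1 == s2:
            continue
        # Reject swaps that would create a duplicate edge.
        if (g1, s2) in edge_set or (g2, s1) in edge_set:
            continue
        # Accept: (g1, s1), (g2, s2) become (g1, s2), (g2, s1).
        edge_set -= {(g1, s1), (g2, s2)}
        edge_set |= {(g1, s2), (g2, s1)}
        edge_list[i], edge_list[j] = (g1, s2), (g2, s1)
    return edge_set


# Toy dataset (hypothetical): which gene is mutated in which sample.
toy = {
    ("TP53", "s1"), ("TP53", "s2"), ("KRAS", "s2"), ("KRAS", "s3"),
    ("BRAF", "s1"), ("BRAF", "s3"), ("EGFR", "s4"),
}
shuffled = randomize(toy, n_steps=200, seed=1)
```

Comparing `degrees(toy)` with `degrees(shuffled)` confirms that the gene-wise and sample-wise counts are unchanged, which is the property the null model relies on.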

Tubio JMC, Li Y, Ju YS, Martincorena I, Cooke SL, Tojo M, Gundem G, Pipinikas CP, Zamora J, Raine K et al. 2014. Mobile DNA in cancer. Extensive transduction of nonrepetitive DNA mediated by L1 retrotransposition in cancer genomes. Science, 345 (6196), pp. 1251343.

Long interspersed nuclear element-1 (L1) retrotransposons are mobile repetitive elements that are abundant in the human genome. L1 elements propagate through RNA intermediates. In the germ line, neighboring, nonrepetitive sequences are occasionally mobilized by the L1 machinery, a process called 3' transduction. Because 3' transductions are potentially mutagenic, we explored the extent to which they occur somatically during tumorigenesis. Studying cancer genomes from 244 patients, we found that tumors from 53% of the patients had somatic retrotranspositions, of which 24% were 3' transductions. Fingerprinting of donor L1s revealed that a handful of source L1 elements in a tumor can spawn from tens to hundreds of 3' transductions, which can themselves seed further retrotranspositions. The activity of individual L1 elements fluctuated during tumor evolution and correlated with L1 promoter hypomethylation. The 3' transductions disseminated genes, exons, and regulatory elements to new locations, most often to heterochromatic regions of the genome.

De Bruin E, McGranahan N, Salm M, Wedge D, Mitter R, Yates L, Matthews N, Stewart A, Campbell P, Swanton C. 2014. Intra-tumour heterogeneity in early-stage lung cancer inferred by multi-region sequencing. European Journal of Cancer, 50, pp. S4-S4.

Wedge DC, Gundem G, Van Loo P, Brewer D, Leinonen K, Eeles R, Cooper C, Visakorpi T, McDermott U, Bova GS. 2014. Proffered Paper: The life history of lethal metastatic prostate cancer (The UK prostate cancer working group of the International Cancer Genome Consortium). European Journal of Cancer, 50, pp. S4-S4.

Woll PS, Kjällquist U, Chowdhury O, Doolittle H, Wedge DC, Thongjuea S, Erlandsson R, Ngara M, Anderson K, Deng Q et al. 2014. Myelodysplastic syndromes are propagated by rare and distinct human cancer stem cells in vivo. Cancer Cell, 25 (6), pp. 794-808.

Evidence for distinct human cancer stem cells (CSCs) remains contentious and the degree to which different cancer cells contribute to propagating malignancies in patients remains unexplored. In low- to intermediate-risk myelodysplastic syndromes (MDS), we establish the existence of rare multipotent MDS stem cells (MDS-SCs), and their hierarchical relationship to lineage-restricted MDS progenitors. All identified somatically acquired genetic lesions were backtracked to distinct MDS-SCs, establishing their distinct MDS-propagating function in vivo. In isolated del(5q)-MDS, acquisition of del(5q) preceded diverse recurrent driver mutations. Sequential analysis in del(5q)-MDS revealed genetic evolution in MDS-SCs and MDS-progenitors prior to leukemic transformation. These findings provide definitive evidence for rare human MDS-SCs in vivo, with extensive implications for the targeting of the cells required and sufficient for MDS-propagation.

Woll PS, Kjällquist U, Chowdhury O, Doolittle H, Wedge DC, Thongjuea S, Erlandsson R, Ngara M, Anderson K, Deng Q et al. 2014. Myelodysplastic Syndromes Are Propagated by Rare and Distinct Human Cancer Stem Cells In Vivo (vol 25, pg 794, 2014). Cancer Cell, 25 (6), pp. 861-861.

Nik-Zainal S, Wedge DC, Alexandrov LB, Petljak M, Butler AP, Bolli N, Davies HR, Knappskog S, Martin S, Papaemmanuil E et al. 2014. Association of a germline copy number polymorphism of APOBEC3A and APOBEC3B with burden of putative APOBEC-dependent mutations in breast cancer. Nat Genet, 46 (5), pp. 487-491.

The somatic mutations in a cancer genome are the aggregate outcome of one or more mutational processes operative through the lifetime of the individual with cancer. Each mutational process leaves a characteristic mutational signature determined by the mechanisms of DNA damage and repair that constitute it. A role was recently proposed for the APOBEC family of cytidine deaminases in generating particular genome-wide mutational signatures and a signature of localized hypermutation called kataegis. A germline copy number polymorphism involving APOBEC3A and APOBEC3B, which effectively deletes APOBEC3B, has been associated with modestly increased risk of breast cancer. Here we show that breast cancers in carriers of the deletion show more mutations of the putative APOBEC-dependent genome-wide signatures than cancers in non-carriers. The results suggest that the APOBEC3A-APOBEC3B germline deletion allele confers cancer susceptibility through increased activity of APOBEC-dependent mutational processes, although the mechanism by which this increase in activity occurs remains unknown.

Behjati S, Tarpey PS, Sheldon H, Martincorena I, Van Loo P, Gundem G, Wedge DC, Ramakrishna M, Cooke SL, Pillay N et al. 2014. Recurrent PTPRB and PLCG1 mutations in angiosarcoma. Nat Genet, 46 (4), pp. 376-379.

Angiosarcoma is an aggressive malignancy that arises spontaneously or secondarily to ionizing radiation or chronic lymphoedema. Previous work has identified aberrant angiogenesis, including occasional somatic mutations in angiogenesis signaling genes, as a key driver of angiosarcoma. Here we employed whole-genome, whole-exome and targeted sequencing to study the somatic changes underpinning primary and secondary angiosarcoma. We identified recurrent mutations in two genes, PTPRB and PLCG1, which are intimately linked to angiogenesis. The endothelial phosphatase PTPRB, a negative regulator of vascular growth factor tyrosine kinases, harbored predominantly truncating mutations in 10 of 39 tumors (26%). PLCG1, a signal transducer of tyrosine kinases, encoded a recurrent, likely activating p.Arg707Gln missense variant in 3 of 34 cases (9%). Overall, 15 of 39 tumors (38%) harbored at least one driver mutation in angiogenesis signaling genes. Our findings inform and reinforce current therapeutic efforts to target angiogenesis signaling in angiosarcoma.

Behjati S, Tarpey PS, Presneau N, Scheipl S, Pillay N, Van Loo P, Wedge DC, Cooke SL, Gundem G, Davies H et al. 2014. Distinct H3F3A and H3F3B driver mutations define chondroblastoma and giant cell tumor of bone (vol 45, pg 1479, 2013). Nat Genet, 46 (3), pp. 316-316.

Papaemmanuil E, Rapado I, Li Y, Potter NE, Wedge DC, Tubio J, Alexandrov LB, Van Loo P, Cooke SL, Marshall J et al. 2014. RAG-mediated recombination is the predominant driver of oncogenic rearrangement in ETV6-RUNX1 acute lymphoblastic leukemia. Nat Genet, 46 (2), pp. 116-125.

The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL) cases, is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, characterized by recombination signal sequence motifs near breakpoints, incorporation of non-templated sequence at junctions, ∼30-fold enrichment at promoters and enhancers of genes actively transcribed in B cell development and an unexpectedly high ratio of recurrent to non-recurrent structural variants. Single-cell tracking shows that this mechanism is active throughout leukemic evolution, with evidence of localized clustering and reiterated deletions. Integration of data on point mutations and rearrangements identifies ATF7IP and MGA as two new tumor-suppressor genes in ALL. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1-positive lymphoblasts, targeting the promoters, enhancers and first exons of genes that normally regulate B cell differentiation.

Murchison EP, Wedge DC, Alexandrov LB, Fu B, Martincorena I, Ning Z, Tubio JMC, Werner EI, Allen J, De Nardi AB et al. 2014. Transmissible dog cancer genome reveals the origin and history of an ancient cell lineage. Science, 343 (6169), pp. 437-440.

Canine transmissible venereal tumor (CTVT) is the oldest known somatic cell lineage. It is a transmissible cancer that propagates naturally in dogs. We sequenced the genomes of two CTVT tumors and found that CTVT has acquired 1.9 million somatic substitution mutations and bears evidence of exposure to ultraviolet light. CTVT is remarkably stable and lacks subclonal heterogeneity despite thousands of rearrangements, copy-number changes, and retrotransposon insertions. More than 10,000 genes carry nonsynonymous variants, and 646 genes have been lost. CTVT first arose in a dog with low genomic heterozygosity that may have lived about 11,000 years ago. The cancer spawned by this individual dispersed across continents about 500 years ago. Our results provide a genetic identikit of an ancient dog and demonstrate the robustness of mammalian somatic cells to survive for millennia despite a massive mutation burden.

Bolli N, Avet-Loiseau H, Wedge DC, Van Loo P, Alexandrov LB, Martincorena I, Dawson KJ, Iorio F, Nik-Zainal S, Bignell GR et al. 2014. Heterogeneity of genomic evolution and mutational profiles in multiple myeloma. Nat Commun, 5 (1), pp. 2997.

Multiple myeloma is an incurable plasma cell malignancy with a complex and incompletely understood molecular pathogenesis. Here we use whole-exome sequencing, copy-number profiling and cytogenetics to analyse 84 myeloma samples. Most cases have a complex subclonal structure and show clusters of subclonal variants, including subclonal driver mutations. Serial sampling reveals diverse patterns of clonal evolution, including linear evolution, differential clonal response and branching evolution. Diverse processes contribute to the mutational repertoire, including kataegis and somatic hypermutation, and their relative contribution changes over time. We find heterogeneity of mutational spectrum across samples, with few recurrent genes. We identify new candidate genes, including truncations of SP140, LTB, ROBO1 and clustered missense mutations in EGR1. The myeloma genome is heterogeneous across the cohort, and exhibits diversity in clonal admixture and in dynamics of evolution, which may impact prognostic stratification, therapeutic approaches and assessment of disease response to treatment.

Nangalia J, Massie CE, Baxter EJ, Nice FL, Gundem G, Wedge DC, Avezov E, Li J, Kollmann K, Kent DG et al. 2013. Somatic CALR mutations in myeloproliferative neoplasms with nonmutated JAK2. N Engl J Med, 369 (25), pp. 2391-2405.

BACKGROUND: Somatic mutations in the Janus kinase 2 gene (JAK2) occur in many myeloproliferative neoplasms, but the molecular pathogenesis of myeloproliferative neoplasms with nonmutated JAK2 is obscure, and the diagnosis of these neoplasms remains a challenge. METHODS: We performed exome sequencing of samples obtained from 151 patients with myeloproliferative neoplasms. The mutation status of the gene encoding calreticulin (CALR) was assessed in an additional 1345 hematologic cancers, 1517 other cancers, and 550 controls. We established phylogenetic trees using hematopoietic colonies. We assessed calreticulin subcellular localization using immunofluorescence and flow cytometry. RESULTS: Exome sequencing identified 1498 mutations in 151 patients, with medians of 6.5, 6.5, and 13.0 mutations per patient in samples of polycythemia vera, essential thrombocythemia, and myelofibrosis, respectively. Somatic CALR mutations were found in 70 to 84% of samples of myeloproliferative neoplasms with nonmutated JAK2, in 8% of myelodysplasia samples, in occasional samples of other myeloid cancers, and in none of the other cancers. A total of 148 CALR mutations were identified with 19 distinct variants. Mutations were located in exon 9 and generated a +1 base-pair frameshift, which would result in a mutant protein with a novel C-terminal. Mutant calreticulin was observed in the endoplasmic reticulum without increased cell-surface or Golgi accumulation. Patients with myeloproliferative neoplasms carrying CALR mutations presented with higher platelet counts and lower hemoglobin levels than patients with mutated JAK2. Mutation of CALR was detected in hematopoietic stem and progenitor cells. Clonal analyses showed CALR mutations in the earliest phylogenetic node, a finding consistent with its role as an initiating mutation in some patients. 
CONCLUSIONS: Somatic mutations in the endoplasmic reticulum chaperone CALR were found in a majority of patients with myeloproliferative neoplasms with nonmutated JAK2. (Funded by the Kay Kendall Leukaemia Fund and others.)

Behjati S, Tarpey PS, Presneau N, Scheipl S, Pillay N, Van Loo P, Wedge DC, Cooke SL, Gundem G, Davies H et al. 2013. Distinct H3F3A and H3F3B driver mutations define chondroblastoma and giant cell tumor of bone. Nat Genet, 45 (12), pp. 1479-1482.

It is recognized that some mutated cancer genes contribute to the development of many cancer types, whereas others are cancer type specific. For genes that are mutated in multiple cancer classes, mutations are usually similar in the different affected cancer types. Here, however, we report exquisite tumor type specificity for different histone H3.3 driver alterations. In 73 of 77 cases of chondroblastoma (95%), we found p.Lys36Met alterations predominantly encoded in H3F3B, which is one of two genes for histone H3.3. In contrast, in 92% (49/53) of giant cell tumors of bone, we found histone H3.3 alterations exclusively in H3F3A, leading to p.Gly34Trp or, in one case, p.Gly34Leu alterations. The mutations were restricted to the stromal cell population and were not detected in osteoclasts or their precursors. In the context of previously reported H3F3A mutations encoding p.Lys27Met and p.Gly34Arg or p.Gly34Val alterations in childhood brain tumors, a remarkable picture of tumor type specificity for histone H3.3 driver alterations emerges, indicating that histone H3.3 residues, mutations and genes have distinct functions.

Yen J, White RM, Wedge DC, Van Loo P, de Ridder J, Capper A, Richardson J, Jones D, Raine K, Watson IR et al. 2013. The genetic heterogeneity and mutational burden of engineered melanomas in zebrafish models. Genome Biol, 14 (10), pp. R113.

BACKGROUND: Melanoma is the most deadly form of skin cancer. Expression of oncogenic BRAF or NRAS, which are frequently mutated in human melanomas, promotes the formation of nevi but is not sufficient for tumorigenesis. Even with germline mutated p53, these engineered melanomas present with variable onset and pathology, implicating additional somatic mutations in a multi-hit tumorigenic process. RESULTS: To decipher the genetics of these melanomas, we sequence the protein coding exons of 53 primary melanomas generated from several BRAF(V600E) or NRAS(Q61K) driven transgenic zebrafish lines. We find that engineered zebrafish melanomas show an overall low mutation burden, which has a strong, inverse association with the number of initiating germline drivers. Although tumors reveal distinct mutation spectrums, they show mostly C > T transitions without UV light exposure, and enrichment of mutations in melanogenesis, p53 and MAPK signaling. Importantly, a recurrent amplification occurring with pre-configured drivers BRAF(V600E) and p53-/- suggests a novel path of BRAF cooperativity through the protein kinase A pathway. CONCLUSION: This is the first analysis of a melanoma mutational landscape in the absence of UV light, where tumors manifest with remarkably low mutation burden and high heterogeneity. Genotype specific amplification of protein kinase A in cooperation with BRAF and p53 mutation suggests the involvement of melanogenesis in these tumors. This work is important for defining the spectrum of events in BRAF or NRAS driven melanoma in the absence of UV light, and for informed exploitation of models such as transgenic zebrafish to better understand mechanisms leading to human melanoma formation.

Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale A-L et al. 2013. Signatures of mutational processes in human cancer (vol 500, pg 415, 2013). Nature, 502 (7470), pp. 258-258.

Papaemmanuil E, Gerstung M, Malcovati L, Tauro S, Gundem G, Van Loo P, Yoon CJ, Ellis P, Wedge DC, Pellagatti A et al. 2013. Clinical and biological implications of driver mutations in myelodysplastic syndromes. Blood, 122 (22), pp. 3616-3627.

Myelodysplastic syndromes (MDS) are a heterogeneous group of chronic hematological malignancies characterized by dysplasia, ineffective hematopoiesis and a variable risk of progression to acute myeloid leukemia. Sequencing of MDS genomes has identified mutations in genes implicated in RNA splicing, DNA modification, chromatin regulation, and cell signaling. We sequenced 111 genes across 738 patients with MDS or closely related neoplasms (including chronic myelomonocytic leukemia and MDS-myeloproliferative neoplasms) to explore the role of acquired mutations in MDS biology and clinical phenotype. Seventy-eight percent of patients had 1 or more oncogenic mutations. We identify complex patterns of pairwise association between genes, indicative of epistatic interactions involving components of the spliceosome machinery and epigenetic modifiers. Coupled with inferences on subclonal mutations, these data suggest a hypothesis of genetic "predestination," in which early driver mutations, typically affecting genes involved in RNA splicing, dictate future trajectories of disease evolution with distinct clinical phenotypes. Driver mutations had equivalent prognostic significance, whether clonal or subclonal, and leukemia-free survival deteriorated steadily as numbers of driver mutations increased. Thus, analysis of oncogenic mutations in large, well-characterized cohorts of patients illustrates the interconnections between the cancer genome and disease biology, with considerable potential for clinical application.

Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale A-L et al. 2013. Signatures of mutational processes in human cancer. Nature, 500 (7463), pp. 415-421.

All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.

Tarpey PS, Behjati S, Cooke SL, Van Loo P, Wedge DC, Pillay N, Marshall J, O'Meara S, Davies H, Nik-Zainal S et al. 2013. Frequent mutation of the major cartilage collagen gene COL2A1 in chondrosarcoma. Nat Genet, 45 (8), pp. 923-926.

Chondrosarcoma is a heterogeneous collection of malignant bone tumors and is the second most common primary malignancy of bone after osteosarcoma. Recent work has identified frequent, recurrent mutations in IDH1 or IDH2 in nearly half of central chondrosarcomas. However, there has been little systematic genomic analysis of this tumor type, and, thus, the contribution of other genes is unclear. Here we report comprehensive genomic analyses of 49 individuals with chondrosarcoma (cases). We identified hypermutability of the major cartilage collagen gene COL2A1, with insertions, deletions and rearrangements identified in 37% of cases. The patterns of mutation were consistent with selection for variants likely to impair normal collagen biosynthesis. In addition, we identified mutations in IDH1 or IDH2 (59%), TP53 (20%), the RB1 pathway (33%) and Hedgehog signaling (18%).

Stephens PJ, Davies HR, Mitani Y, Van Loo P, Shlien A, Tarpey PS, Papaemmanuil E, Cheverton A, Bignell GR, Butler AP et al. 2013. Whole exome sequencing of adenoid cystic carcinoma. J Clin Invest, 123 (7), pp. 2965-2968.

Adenoid cystic carcinoma (ACC) is a rare malignancy that can occur in multiple organ sites and is primarily found in the salivary gland. While the identification of recurrent fusions of the MYB-NFIB genes has begun to shed light on the molecular underpinnings, little else is known about the molecular genetics of this frequently fatal cancer. We have undertaken exome sequencing in a series of 24 ACC to further delineate the genetics of the disease. We identified multiple mutated genes that, combined, implicate chromatin deregulation in half of cases. Further, mutations were identified in known cancer genes, including PIK3CA, ATM, CDKN2A, SF3B1, SUFU, TSC1, and CYLD. Mutations in NOTCH1/2 were identified in 3 cases, and we identify the negative NOTCH signaling regulator, SPEN, as a new cancer gene in ACC with mutations in 5 cases. Finally, the identification of 3 likely activating mutations in the tyrosine kinase receptor FGFR2, analogous to those reported in ovarian and endometrial carcinoma, points to potential therapeutic avenues for a subset of cases.

Nik-Zainal S, Alexandrov L, Wedge D, Van Loo P, Raine K, Jones D, Campbell P, Stratton M. 2013. Kataegis and other somatic mutational signatures in cancer. Chromosome Research, 21, pp. S13-S13.

Argyri AA, Jarvis RM, Wedge D, Xu Y, Panagou EZ, Goodacre R, Nychas G-JE. 2013. A comparison of Raman and FT-IR spectroscopy for the prediction of meat spoilage. Food Control, 29 (2), pp. 461-470.

In this study, time series spectroscopic, microbiological and sensory analysis data were obtained from minced beef samples stored under different packaging conditions (aerobic and modified atmosphere packaging) at 5°C. These data were analyzed using machine learning and evolutionary computing methods, including partial least squares regression (PLS-R), genetic programming (GP), genetic algorithm (GA), artificial neural networks (ANNs) and support vector machines regression (SVR) including different kernel functions [i.e. linear (SVR L), polynomial (SVR P), radial basis (RBF) (SVR R) and sigmoid functions (SVR S)]. Models predictive of the microbiological load and sensory assessment were calculated using these methods and the relative performance compared. In general, it was observed that for both FT-IR and Raman calibration models, better predictions were obtained for TVC, LAB and Enterobacteriaceae, whilst the FT-IR models performed in general slightly better in predicting the microbial counts compared to the Raman models. Additionally, regarding the predictions of the microbial counts, the multivariate methods (SVM, PLS), which had similar performances, gave better predictions compared to the evolutionary ones (GA-GP, GA-ANN, GP). On the other hand, the GA-GP model performed better than the others in predicting the sensory scores using the FT-IR data, whilst the GA-ANN model performed better in predicting the sensory scores using the Raman data. The results of this study demonstrate for the first time that Raman spectroscopy as well as FT-IR spectroscopy can be used reliably and accurately for the rapid assessment of meat spoilage. © 2012 Elsevier Ltd.

Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR. 2013. Deciphering signatures of mutational processes operative in human cancer. Cell Rep, 3 (1), pp. 246-259.

The genome of a cancer cell carries somatic mutations that are the cumulative consequences of the DNA damage and repair processes operative during the cellular lineage between the fertilized egg and the cancer cell. Remarkably, these mutational processes are poorly characterized. Global sequencing initiatives are yielding catalogs of somatic mutations from thousands of cancers, thus providing the unique opportunity to decipher the signatures of mutational processes operative in human cancer. However, until now there have been no theoretical models describing the signatures of mutational processes operative in cancer genomes and no systematic computational approaches are available to decipher these mutational signatures. Here, by modeling mutational processes as a blind source separation problem, we introduce a computational framework that effectively addresses these questions. Our approach provides a basis for characterizing mutational signatures from cancer-derived somatic mutational catalogs, paving the way to insights into the pathogenetic mechanism underlying all cancers.

Vaughan AA, Dunn WB, Allwood JW, Wedge DC, Blackhall FH, Whetton AD, Dive C, Goodacre R. 2012. Liquid chromatography-mass spectrometry calibration transfer and metabolomics data fusion. Anal Chem, 84 (22), pp. 9848-9857.

Metabolic profiling is routinely performed on multiple analytical platforms to increase the coverage of detected metabolites, and it is often necessary to distribute biological and clinical samples from a study between instruments of the same type to share the workload between different laboratories. The ability to combine metabolomics data arising from different sources is therefore of great interest, particularly for large-scale or long-term studies, where samples must be analyzed in separate blocks. This is not a trivial task, however, due to differing data structures, temporal variability, and instrumental drift. In this study, we employed blood serum and plasma samples collected from 29 subjects diagnosed with small cell lung cancer and analyzed each sample on two liquid chromatography-mass spectrometry (LC-MS) platforms. We describe a method for mapping retention times and matching metabolite features between platforms and approaches for fusing data acquired from both instruments. Calibration transfer models were developed and shown to be successful at mapping the response of one LC-MS instrument to another (Procrustes dissimilarity = 0.04; Mantel correlation = 0.95), allowing us to merge the data from different samples analyzed on different instruments. Data fusion was assessed in a clinical context by comparing the correlation of each metabolite with subject survival time in both the original and fused data sets: a simple autoscaling procedure (Pearson's R = 0.99) was found to improve upon a calibration transfer method based on partial least-squares regression (R = 0.94).

McBride DJ, Etemadmoghadam D, Cooke SL, Alsop K, George J, Butler A, Cho J, Galappaththige D, Greenman C, Howarth KD et al. 2012. Tandem duplication of chromosomal segments is common in ovarian and breast cancer genomes. J Pathol, 227 (4), pp. 446-455.

The application of paired-end next generation sequencing approaches has made it possible to systematically characterize rearrangements of the cancer genome to base-pair level. Utilizing this approach, we report the first detailed analysis of ovarian cancer rearrangements, comparing high-grade serous and clear cell cancers, and these histotypes with other solid cancers. Somatic rearrangements were systematically characterized in eight high-grade serous and five clear cell ovarian cancer genomes and we report here the identification of > 600 somatic rearrangements. Recurrent rearrangements of the transcriptional regulator gene, TSHZ3, were found in three of eight serous cases. Comparison to breast, pancreatic and prostate cancer genomes revealed that a subset of ovarian cancers share a marked tandem duplication phenotype with triple-negative breast cancers. The tandem duplication phenotype was not linked to BRCA1/2 mutation, suggesting that other common mechanisms or carcinogenic exposures are operative. High-grade serous cancers arising in women with germline BRCA1 or BRCA2 mutation showed a high frequency of small chromosomal deletions. These findings indicate that BRCA1/2 germline mutation may contribute to widespread structural change and that other undefined mechanism(s), which are potentially shared with triple-negative breast cancer, promote tandem chromosomal duplications that sculpt the ovarian cancer genome.

Stephens PJ, Tarpey PS, Davies H, Van Loo P, Greenman C, Wedge DC, Nik-Zainal S, Martin S, Varela I, Bignell GR et al. 2012. The landscape of cancer genes and mutational processes in breast cancer. Nature, 486 (7403), pp. 400-404.

All cancers carry somatic mutations in their genomes. A subset, known as driver mutations, confer clonal selective advantage on cancer cells and are causally implicated in oncogenesis, and the remainder are passenger mutations. The driver mutations and mutational processes operative in breast cancer have not yet been comprehensively explored. Here we examine the genomes of 100 tumours for somatic copy number changes and mutations in the coding exons of protein-coding genes. The number of somatic mutations varied markedly between individual tumours. We found strong correlations between mutation number, age at which cancer was diagnosed and cancer histological grade, and observed multiple mutational signatures, including one present in about ten per cent of tumours characterized by numerous mutations of cytosine at TpC dinucleotides. Driver mutations were identified in several new cancer genes including AKT2, ARID1B, CASP8, CDKN1B, MAP3K1, MAP3K13, NCOR1, SMARCD1 and TBX3. Among the 100 tumours, we found driver mutations in at least 40 cancer genes and 73 different combinations of mutated cancer genes. The results highlight the substantial genetic diversity underlying this common disease.

Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, Jones D, Hinton J, Marshall J, Stebbings LA et al. 2012. Mutational processes molding the genomes of 21 breast cancers. Cell, 149 (5), pp. 979-993.

All cancers carry somatic mutations. The patterns of mutation in cancer genomes reflect the DNA damage and repair processes to which cancer cells and their precursors have been exposed. To explore these mechanisms further, we generated catalogs of somatic mutation from 21 breast cancers and applied mathematical methods to extract mutational signatures of the underlying processes. Multiple distinct single- and double-nucleotide substitution signatures were discernible. Cancers with BRCA1 or BRCA2 mutations exhibited a characteristic combination of substitution mutation signatures and a distinctive profile of deletions. Complex relationships between somatic mutation prevalence and transcription were detected. A remarkable phenomenon of localized hypermutation, termed "kataegis," was observed. Regions of kataegis differed between cancers but usually colocalized with somatic rearrangements. Base substitutions in these regions were almost exclusively of cytosine at TpC dinucleotides. The mechanisms underlying most of these mutational signatures are unknown. However, a role for the APOBEC family of cytidine deaminases is proposed.

Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, Raine K, Jones D, Marshall J, Ramakrishna M et al. 2012. The life history of 21 breast cancers. Cell, 149 (5), pp. 994-1007.

Cancer evolves dynamically as clonal expansions supersede one another driven by shifting selective pressures, mutational processes, and disrupted cancer genes. These processes mark the genome, such that a cancer's life history is encrypted in the somatic mutations present. We developed algorithms to decipher this narrative and applied them to 21 breast cancers. Mutational processes evolve across a cancer's lifespan, with many emerging late but contributing extensive genetic variation. Subclonal diversification is prominent, and most mutations are found in just a fraction of tumor cells. Every tumor has a dominant subclonal lineage, representing more than 50% of tumor cells. Minimal expansion of these subclones occurs until many hundreds to thousands of mutations have accumulated, implying the existence of long-lived, quiescent cell lineages capable of substantial proliferation upon acquisition of enabling genomic changes. Expansion of the dominant subclone to an appreciable mass may therefore represent the final rate-limiting step in a breast cancer's development, triggering diagnosis.

Van Loo P, Wedge DC, Nik-Zainal S, Stratton MR, Futreal A, Campbell PJ, Breast ICGC. 2012. Exploring the subclonal architecture of breast cancer. Cancer Research, 72.

Murchison EP, Schulz-Trieglaff OB, Ning Z, Alexandrov LB, Bauer MJ, Fu B, Hims M, Ding Z, Ivakhno S, Stewart C et al. 2012. Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer. Cell, 148 (4), pp. 780-791.

The Tasmanian devil (Sarcophilus harrisii), the largest marsupial carnivore, is endangered due to a transmissible facial cancer spread by direct transfer of living cancer cells through biting. Here we describe the sequencing, assembly, and annotation of the Tasmanian devil genome and whole-genome sequences for two geographically distant subclones of the cancer. Genomic analysis suggests that the cancer first arose from a female Tasmanian devil and that the clone has subsequently genetically diverged during its spread across Tasmania. The devil cancer genome contains more than 17,000 somatic base substitution mutations and bears the imprint of a distinct mutational process. Genotyping of somatic mutations in 104 geographically and temporally distributed Tasmanian devil tumors reveals the pattern of evolution and spread of this parasitic clonal lineage, with evidence of a selective sweep in one geographical area and persistence of parallel lineages in other populations.

Brown M, Wedge DC, Goodacre R, Kell DB, Baker PN, Kenny LC, Mamas MA, Neyses L, Dunn WB. 2012. Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets. Bioinformatics (Oxford, England), 28 (1), pp. 149-149.

Eyers CE, Lawless C, Wedge DC, Lau KW, Gaskell SJ, Hubbard SJ. 2011. CONSeQuence: prediction of reference peptides for absolute quantitative proteomics using consensus machine learning approaches. Mol Cell Proteomics, 10 (11), pp. M110.003384.

Mass spectrometric based methods for absolute quantification of proteins, such as QconCAT, rely on internal standards of stable-isotope labeled reference peptides, or "Q-peptides," to act as surrogates. Key to the success of this and related methods for absolute protein quantification (such as AQUA) is selection of the Q-peptide. Here we describe a novel method, CONSeQuence (consensus predictor for Q-peptide sequence), based on four different machine learning approaches for Q-peptide selection. CONSeQuence demonstrates improved performance over existing methods for optimal Q-peptide selection in the absence of prior experimental information, as validated using two independent test sets derived from yeast. Furthermore, we examine the physicochemical parameters associated with good peptide surrogates, and demonstrate that in addition to charge and hydrophobicity, peptide secondary structure plays a significant role in determining peptide "detectability" in liquid chromatography-electrospray ionization experiments. We relate peptide properties to protein tertiary structure, demonstrating a counterintuitive preference for buried status for frequently detected peptides. Finally, we demonstrate the improved efficacy of the general approach by applying a predictor trained on yeast data to sets of proteotypic peptides from two additional species taken from an existing peptide identification repository.

Wedge DC, Allwood JW, Dunn W, Vaughan AA, Simpson K, Brown M, Priest L, Blackhall FH, Whetton AD, Dive C, Goodacre R. 2011. Is serum or plasma more appropriate for intersubject comparisons in metabolomic studies? An assessment in patients with small-cell lung cancer. Anal Chem, 83 (17), pp. 6689-6697.

In clinical analyses, the most appropriate biofluid should be analyzed for optimal assay performance. For biological fluids, the most readily accessible is blood, and metabolomic analyses can be performed either on plasma or serum. To determine the optimal agent for analysis, metabolic profiles of matched human serum and plasma were assessed by gas chromatography/time-of-flight mass spectrometry and ultrahigh-performance liquid chromatography mass spectrometry (in positive and negative electrospray ionization modes). Comparison of the two metabolomes, in terms of reproducibility, discriminative ability and coverage, indicated that they offered similar analytical opportunities. An analysis of the variation between 29 small-cell lung cancer (SCLC) patients revealed that the differences between individuals are markedly similar for the two biofluids. However, significant differences between the levels of some specific metabolites were identified, as were differences in the intersubject variability of some metabolite levels. Glycerophosphocholines, erythritol, creatinine, hexadecanoic acid, and glutamine in plasma, but not in serum, were shown to correlate with life expectancy for SCLC patients, indicating the utility of metabolomic analyses in clinical prognosis and the particular utility of plasma in relation to the clinical management of SCLC.

Brown M, Wedge DC, Goodacre R, Kell DB, Baker PN, Kenny LC, Mamas MA, Neyses L, Dunn WB. 2011. Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets. Bioinformatics, 27 (8), pp. 1108-1112.

MOTIVATION: The study of metabolites (metabolomics) is increasingly being applied to investigate microbial, plant, environmental and mammalian systems. One of the limiting factors is that of chemically identifying metabolites from mass spectrometric signals present in complex datasets. RESULTS: Three workflows have been developed to allow for the rapid, automated and high-throughput annotation and putative metabolite identification of electrospray LC-MS-derived metabolomic datasets. The collection of workflows are defined as PUTMEDID_LCMS and perform feature annotation, matching of accurate m/z to the accurate mass of neutral molecules and associated molecular formula and matching of the molecular formulae to a reference file of metabolites. The software is independent of the instrument and data pre-processing applied. The number of false positives is reduced by eliminating the inaccurate matching of many artifact, isotope, multiply charged and complex adduct peaks through complex interrogation of experimental data. AVAILABILITY: The workflows, standard operating procedure and further information are publicly available at http://www.mcisb.org/resources/putmedid.html. CONTACT: warwick.dunn@manchester.ac.uk.

Wedge DC, Krishna R, Blackhurst P, Siepen JA, Jones AR, Hubbard SJ. 2011. FDRAnalysis: a tool for the integrated analysis of tandem mass spectrometry identification results from multiple search engines. J Proteome Res, 10 (4), pp. 2088-2094.

Confident identification of peptides via tandem mass spectrometry underpins modern high-throughput proteomics. This has motivated considerable recent interest in the postprocessing of search engine results to increase confidence and calculate robust statistical measures, for example through the use of decoy databases to calculate false discovery rates (FDR). FDR-based analyses allow for multiple testing and can assign a single confidence value for both sets and individual peptide spectrum matches (PSMs). We recently developed an algorithm for combining the results from multiple search engines, integrating FDRs for sets of PSMs made by different search engine combinations. Here we describe a web-server and a downloadable application that makes this routinely available to the proteomics community. The web server offers a range of outputs including informative graphics to assess the confidence of the PSMs and any potential biases. The underlying pipeline also provides a basic protein inference step, integrating PSMs into protein ambiguity groups where peptides can be matched to more than one protein. Importantly, we have also implemented full support for the mzIdentML data standard, recently released by the Proteomics Standards Initiative, providing users with the ability to convert native formats to mzIdentML files, which are available to download.

Rowe W, Wedge DC, Platt M, Kell DB, Knowles J. 2010. Predictive models for population performance on real biological fitness landscapes. Bioinformatics, 26 (17), pp. 2145-2152.

MOTIVATION: Directed evolution, in addition to its principal application of obtaining novel biomolecules, offers significant potential as a vehicle for obtaining useful information about the topologies of biomolecular fitness landscapes. In this article, we make use of a special type of model of fitness landscapes-based on finite state machines-which can be inferred from directed evolution experiments. Importantly, the model is constructed only from the fitness data and phylogeny, not sequence or structural information, which is often absent. The model, called a landscape state machine (LSM), has already been used successfully in the evolutionary computation literature to model the landscapes of artificial optimization problems. Here, we use the method for the first time to simulate a biological fitness landscape based on experimental evaluation. RESULTS: We demonstrate in this study that LSMs are capable not only of representing the structure of model fitness landscapes such as NK-landscapes, but also the fitness landscape of real DNA oligomers binding to a protein (allophycocyanin), data we derived from experimental evaluations on microarrays. The LSMs prove adept at modelling the progress of evolution as a function of various controlling parameters, as validated by evaluations on the real landscapes. Specifically, the ability of the model to 'predict' optimal mutation rates and other parameters of the evolution is demonstrated. A modification to the standard LSM also proves accurate at predicting the effects of recombination on the evolution.

Rowe W, Platt M, Wedge DC, Day PJ, Kell DB, Knowles J. 2010. Analysis of a complete DNA-protein affinity landscape. J R Soc Interface, 7 (44), pp. 397-408.

Properties of biological fitness landscapes are of interest to a wide sector of the life sciences, from ecology to genetics to synthetic biology. For biomolecular fitness landscapes, the information we currently possess comes primarily from two sources: sparse samples obtained from directed evolution experiments; and more fine-grained but less authentic information from 'in silico' models (such as NK-landscapes). Here we present the entire protein-binding profile of all variants of a nucleic acid oligomer 10 bases in length, which we have obtained experimentally by a series of highly parallel on-chip assays. The resulting complete landscape of sequence-binding pairs, comprising more than one million binding measurements in duplicate, has been analysed statistically using a number of metrics commonly applied to synthetic landscapes. These metrics show that the landscape is rugged, with many local optima, and that this arises from a combination of experimental variation and the natural structural properties of the oligonucleotides.

Kettle J, Whitelegg S, Song AM, Wedge DC, Kotacka L, Kolarik V, Madec MB, Yeates SG, Turner ML. 2010. Fabrication of planar organic nanotransistors using low temperature thermal nanoimprint lithography for chemical sensor applications. Nanotechnology, 21 (7), pp. 75301.

A new fabrication process for the patterning of organic semiconductors at the nanoscale has been developed using low temperature thermal nanoimprint lithography and the details of this process are discussed. Novel planar nanotransistors have been fabricated and characterized from poly(3-hexylthiophene) (P3HT) and we demonstrate the feasibility of using such devices as highly sensitive chemical sensors.

Rowe W, Platt M, Wedge DC, Day PJR, Kell DB, Knowles JD. 2010. Convergent evolution to an aptamer observed in small populations on DNA microarrays. Phys Biol, 7 (3), pp. 036007.

The development of aptamers on custom synthesized DNA microarrays, which has been demonstrated in recent publications, can facilitate detailed analyses of sequence and fitness relationships. Here we use the technique to observe the paths taken through sequence-fitness space by three different evolutionary regimes: asexual reproduction, recombination and model-based evolution. The different evolutionary runs are made on the same array chip in triplicate, each one starting from a small population initialized independently at random. When evolving to a common target protein, glucose-6-phosphate dehydrogenase (G6PD), these nine distinct evolutionary runs are observed to develop aptamers with high affinity and to converge on the same motif not present in any of the starting populations. Regime specific differences in the evolutions, such as speed of convergence, could also be observed.

Wedge DC, Das A, Dost R, Kettle J, Madec M-B, Morrison JJ, Grell M, Kell DB, Richardson TH, Yeates S, Turner ML. 2009. Real-time vapour sensing using an OFET-based electronic nose and genetic programming. Sensors and Actuators B: Chemical, 143 (1), pp. 365-372.

Electronic noses (e-noses) are increasingly being used as vapour sensors in a range of application areas. E-noses made up of arrays of organic field-effect transistors (OFETs) are particularly valuable due to the range and diversity of the information which they provide concerning analyte binding. This study demonstrates that arrays of OFETs, when combined with a data analysis technique using Genetic Programming (GP), can selectively detect airborne analytes in real time. The use of multiple parameters - on resistance, off current and mobility - collected from multiple transistors coated with different semiconducting polymers gives dramatic improvements in the sensitivity (true positive rate), specificity (true negative rate) and speed of sensing. Computer-controlled data collection allows the identification of analytes in real time, with a time-lag between exposure and detection of the order of 4 s. © 2009 Elsevier B.V. All rights reserved.

Harding AP, Wedge DC, Popelier PLA. 2009. pK(a) prediction from "Quantum Chemical Topology" descriptors. J Chem Inf Model, 49 (8), pp. 1914-1924.

Knowing the pK(a) of a compound gives insight into many properties relevant to many industries, in particular the pharmaceutical industry during drug development processes. In light of this, we have used the theory of Quantum Chemical Topology (QCT) to provide ab initio descriptors that are able to accurately predict pK(a) values for 228 carboxylic acids. This Quantum Topological Molecular Similarity (QTMS) study involved the comparison of 5 increasingly expensive levels of theory to conclude that HF/6-31G(d) and B3LYP/6-311+G(2d,p) provided an accurate representation of the compounds studied. We created global and subset models for the carboxylic acids using Partial Least Squares (PLS), Support Vector Machines (SVM), and Radial Basis Function Neural Networks (RBFNN). The models were extensively validated using 4-, 7-, and 10-fold cross-validation, with the validation sets selected based on systematic and random sampling. HF/6-31G(d) in conjunction with SVM provided the best statistics when taking into account the large increase in CPU time required to optimize the geometries at the B3LYP/6-311+G(2d,p) level. The SVM models provided an average q(2) value of 0.886 and an RMSE value of 0.293 for all the carboxylic acids, a q(2) of 0.825 and RMSE of 0.378 for the ortho-substituted acids, a q(2) of 0.923 and RMSE of 0.112 for the para- and meta-substituted acids, and a q(2) of 0.906 and RMSE of 0.268 for the aliphatic acids. Our method compares favorably to ACD/Laboratories, VCCLAB, SPARC, and ChemAxon's pK(a) prediction software based on the RMSE calculated by the leave-one-out method.

Platt M, Rowe W, Wedge DC, Kell DB, Knowles J, Day PJR. 2009. Aptamer evolution for array-based diagnostics. Anal Biochem, 390 (2), pp. 203-205.

Closed loop aptameric directed evolution (CLADE) is a technique enabling simultaneous discovery, evolution, and optimization of aptamers. It was previously demonstrated using a fluorescent protein, and here we extend its applicability with the generation of surface-bound aptamers for targets containing no natural fluorescence. Starting from a random population, in four generations CLADE produced a new aptamer to thrombin with high specificity and affinity. The best aptameric sequence was void of the set of four guanine repeats typifying thrombin aptamers and, thus, highlights the benefits of evolution performed in an environment closely mimicking the final diagnostic application.

Das A, Dost R, Richardson TH, Grell M, Wedge DC, Kell DB, Morrison JJ, Turner ML. 2009. Low cost, portable, fast multiparameter data acquisition system for organic transistor odour sensors. Sensors and Actuators B: Chemical, 137 (2), pp. 586-591.

We demonstrate a cost-effective but fast multiparameter data acquisition system for odour sensors based on low threshold organic field effect transistors (OFETs) with an amorphous methoxy-derivative of poly(triaryl amine) (PTA-OMe) as semiconductor. The system applies a simple algorithm to measure OFET saturated transfer characteristics with a tailored operational amplifier circuit that is interfaced to a laptop that controls the circuit and analyses data with bespoke software. Despite the semiconductor's low charge carrier mobility μ ∼ 5 × 10⁻⁵ cm²/(V s), the system returns multiparameter OFET data: OFET source-drain current ISD in both the 'on' and 'off' state, carrier mobility μ, and threshold voltage (VT), in real time (resolution <1 s). The system is tested by exposing the OFET to a series of alcohol odours at different concentrations. Sensor response is quick, and follows a distinct trend IPA > PrOH > EtOH > MeOH. Crown Copyright © 2009.

Wedge DC, Rowe W, Kell DB, Knowles J. 2009. In silico modelling of directed evolution: Implications for experimental design and stepwise evolution. J Theor Biol, 257 (1), pp. 131-141.

We model the process of directed evolution (DE) in silico using genetic algorithms. Making use of the NK fitness landscape model, we analyse the effects of mutation rate, crossover and selection pressure on the performance of DE. A range of values of K, the epistatic interaction of the landscape, are considered, and high- and low-throughput modes of evolution are compared. Our findings suggest that for runs of around ten generations' duration, as is typical in DE, there is little difference between the way in which DE needs to be configured in the high- and low-throughput regimes, nor across different degrees of landscape epistasis. In all cases, a high selection pressure (but not an extreme one) combined with a moderately high mutation rate works best, while crossover provides some benefit but only on the less rugged landscapes. These genetic algorithms were also compared with a "model-based approach" from the literature, which uses sequential fixing of the problem parameters based on fitting a linear model. Overall, we find that purely evolutionary techniques fare better than do model-based approaches across all but the smoothest landscapes.
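
As a rough illustration of the kind of simulation described above, the sketch below builds an NK fitness landscape and runs a short mutation-only genetic algorithm on it. It is not the configuration used in the paper: the names `make_nk_landscape` and `evolve`, the use of truncation selection with unchanged parents carried over, the omission of crossover, and all default parameter values are assumptions made for illustration.

```python
import random


def make_nk_landscape(n, k, seed=0):
    """Return a fitness function over binary genomes of length n.

    Each locus contributes a random value that depends on its own state
    and on the states of k other randomly chosen loci (epistasis).
    """
    rng = random.Random(seed)
    neighbours = [
        [i] + rng.sample([j for j in range(n) if j != i], k) for i in range(n)
    ]
    # One lookup table per locus, with an entry for every neighbourhood state.
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]

    def fitness(genome):
        total = 0.0
        for i in range(n):
            index = 0
            for j in neighbours[i]:
                index = (index << 1) | genome[j]
            total += tables[i][index]
        return total / n

    return fitness


def evolve(fitness, n, pop_size=20, generations=10,
           mutation_rate=0.05, keep=5, seed=1):
    """Mutation-only GA; returns the best fitness seen in each generation.

    A smaller `keep` means a higher selection pressure.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(pop[0]))
        parents = pop[:keep]  # truncation selection; parents survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            child = list(rng.choice(parents))
            for pos in range(n):
                if rng.random() < mutation_rate:
                    child[pos] ^= 1  # flip this bit
            children.append(child)
        pop = parents + children
    return best_per_generation
```

Calling `evolve(make_nk_landscape(12, 2), 12)` runs ten generations on a moderately rugged landscape; raising `k` makes the landscape more rugged, and varying `keep` and `mutation_rate` gives a feel for the trade-offs the abstract discusses.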

Knight CG, Platt M, Rowe W, Wedge DC, Khan F, Day PJR, McShea A, Knowles J, Kell DB. 2009. Array-based evolution of DNA aptamers allows modelling of an explicit sequence-fitness landscape. Nucleic Acids Res, 37 (1), pp. e6.

Mapping the landscape of possible macromolecular polymer sequences to their fitness in performing biological functions is a challenge across the biosciences. A paradigm is the case of aptamers, nucleic acids that can be selected to bind particular target molecules. We have characterized the sequence-fitness landscape for aptamers binding allophycocyanin (APC) protein via a novel Closed Loop Aptameric Directed Evolution (CLADE) approach. In contrast to the conventional SELEX methodology, selection and mutation of aptamer sequences was carried out in silico, with explicit fitness assays for 44,131 aptamers of known sequence using DNA microarrays in vitro. We capture the landscape using a predictive machine learning model linking sequence features and function and validate this model using 5500 entirely separate test sequences, which give a very high observed versus predicted correlation of 0.87. This approach reveals a complex sequence-fitness mapping, and hypotheses for the physical basis of aptameric binding; it also enables rapid design of novel aptamers with desired binding properties. We demonstrate an extension to the approach by incorporating prior knowledge into CLADE, resulting in some of the tightest binding sequences.

Wedge D, Ingram D, McLean D, Mingham C, Bandar Z. 2006. On global-local artificial neural networks for function approximation. IEEE Trans Neural Netw, 17 (4), pp. 942-952.

We present a hybrid radial basis function (RBF) sigmoid neural network with a three-step training algorithm that utilizes both global search and gradient descent training. The algorithm used is intended to identify global features of an input-output relationship before adding local detail to the approximating function. It aims to achieve efficient function approximation through the separate identification of aspects of a relationship that are expressed universally from those that vary only within particular regions of the input space. We test the effectiveness of our method using five regression tasks; four use synthetic datasets while the last problem uses real-world data on the wave overtopping of seawalls. It is shown that the hybrid architecture is often superior to architectures containing neurons of a single type in several ways: lower mean square errors are often achievable using fewer hidden neurons and with less need for regularization. Our global-local artificial neural network (GL-ANN) is also seen to compare favorably with both perceptron radial basis net and regression tree derived RBFs. A number of issues concerning the training of GL-ANNs are discussed: the use of regularization, the inclusion of a gradient descent optimization step, the choice of RBF spreads, model selection, and the development of appropriate stopping criteria.

Wedge DC, Ingram DM, Mingham CG, McLean DA, Bandar ZA. 2005. Neural network architectures and overtopping predictions. Proceedings of the Institution of Civil Engineers - Maritime Engineering, 158 (3), pp. 123-133.

Overtopping of seawalls presents a considerable hazard to people and property near the coast and accurate predictions of overtopping volumes are essential in informing seawall construction. The methods most commonly used for the prediction of time-averaged overtopping volumes are parametric regression and numerical modelling. In this paper overtopping volumes are predicted using artificial neural networks. This approach is inherently non-parametric and accepts data from a variety of structural configurations and sea-states. Two different types of neural network are considered: multi-layer perceptron networks and radial basis function networks. It was found that the radial basis function networks considerably outperform both the multi-layer perceptron networks and the curve-fitting (parametric regression) regime, and approach bespoke numerical simulations in accuracy. Unlike numerical simulation, the neural network approach gives generic prediction across a range of structures and sea-states and therefore incurs considerably less computational cost.

Tarabichi M, Martincorena I, Gerstung M, Leroi AM, Markowetz F, PCAWG Evolution and Heterogeneity Working Group, Spellman PT, Morris QD, Lingjærde OC, Wedge DC, Van Loo P. 2018. Neutral tumor evolution? Nat Genet, 50 (12), pp. 1630-1633.

Wedge DC, Gundem G, Mitchell T, Woodcock DJ, Martincorena I, Ghori M, Zamora J, Butler A, Whitaker H, Kote-Jarai Z et al. 2018. Sequencing of prostate cancers identifies new cancer genes, routes of progression and drug targets. Nat Genet, 50 (5), pp. 682-692.

Prostate cancer represents a substantial clinical challenge because it is difficult to predict outcome and advanced disease is often fatal. We sequenced the whole genomes of 112 primary and metastatic prostate cancer samples. From joint analysis of these cancers with those from previous studies (930 cancers in total), we found evidence for 22 previously unidentified putative driver genes harboring coding mutations, as well as evidence for NEAT1 and FOXA1 acting as drivers through noncoding mutations. Through the temporal dissection of aberrations, we identified driver mutations specifically associated with steps in the progression of prostate cancer, establishing, for example, loss of CHD1 and BRCA2 as early events in cancer development of ETS fusion-negative cancers. Computational chemogenomic (canSAR) analysis of prostate cancer mutations identified 11 targets of approved drugs, 7 targets of investigational drugs, and 62 targets of compounds that may be active and should be considered candidates for future clinical trials.

Yates LR, Knappskog S, Wedge D, Farmery JHR, Gonzalez S, Martincorena I, Alexandrov LB, Van Loo P, Haugland HK, Lilleng PK et al. 2017. Genomic Evolution of Breast Cancer Metastasis and Relapse. Cancer Cell, 32 (2), pp. 169-184.e7.

Patterns of genomic evolution between primary and metastatic breast cancer have not been studied in large numbers, despite patients with metastatic breast cancer having dismal survival. We sequenced whole genomes or a panel of 365 genes on 299 samples from 170 patients with locally relapsed or metastatic breast cancer. Several lines of analysis indicate that clones seeding metastasis or relapse disseminate late from primary tumors, but continue to acquire mutations, mostly accessing the same mutational processes active in the primary tumor. Most distant metastases acquired driver mutations not seen in the primary tumor, drawing from a wider repertoire of cancer genes than early drivers. These include a number of clinically actionable alterations and mutations inactivating SWI-SNF and JAK2-STAT3 pathways.

Yates LR, Gerstung M, Knappskog S, Desmedt C, Gundem G, Van Loo P, Aas T, Alexandrov LB, Larsimont D, Davies H et al. 2015. Subclonal diversification of primary breast cancer revealed by multiregion sequencing. Nat Med, 21 (7), pp. 751-759.

The sequencing of cancer genomes may enable tailoring of therapeutics to the underlying biological abnormalities driving a particular patient's tumor. However, sequencing-based strategies rely heavily on representative sampling of tumors. To understand the subclonal structure of primary breast cancer, we applied whole-genome and targeted sequencing to multiple samples from each of 50 patients' tumors (303 samples in total). The extent of subclonal diversification varied among cases and followed spatial patterns. No strict temporal order was evident, with point mutations and rearrangements affecting the most common breast cancer genes, including PIK3CA, TP53, PTEN, BRCA2 and MYC, occurring early in some tumors and late in others. In 13 out of 50 cancers, potentially targetable mutations were subclonal. Landmarks of disease progression, such as resistance to chemotherapy and the acquisition of invasive or metastatic potential, arose within detectable subclones of antecedent lesions. These findings highlight the importance of including analyses of subclonal structure and tumor evolution in clinical trials of primary breast cancer.

Gundem G, Van Loo P, Kremeyer B, Alexandrov LB, Tubio JMC, Papaemmanuil E, Brewer DS, Kallio HML, Högnäs G, Annala M et al. 2015. The evolutionary history of lethal metastatic prostate cancer. Nature, 520 (7547), pp. 353-357.

Cancers emerge from an ongoing Darwinian evolutionary process, often leading to multiple competing subclones within a single primary tumour. This evolutionary process culminates in the formation of metastases, which is the cause of 90% of cancer-related deaths. However, despite its clinical importance, little is known about the principles governing the dissemination of cancer cells to distant organs. Although the hypothesis that each metastasis originates from a single tumour cell is generally supported, recent studies using mouse models of cancer demonstrated the existence of polyclonal seeding from and interclonal cooperation between multiple subclones. Here we sought definitive evidence for the existence of polyclonal seeding in human malignancy and to establish the clonal relationship among different metastases in the context of androgen-deprived metastatic prostate cancer. Using whole-genome sequencing, we characterized multiple metastases arising from prostate tumours in ten patients. Integrated analyses of subclonal architecture revealed the patterns of metastatic spread in unprecedented detail. Metastasis-to-metastasis spread was found to be common, either through de novo monoclonal seeding of daughter metastases or, in five cases, through the transfer of multiple tumour clones between metastatic sites. Lesions affecting tumour suppressor genes usually occur as single events, whereas mutations in genes involved in androgen receptor signalling commonly involve multiple, convergent events in different metastases. Our results elucidate in detail the complex patterns of metastatic spread and further our understanding of the development of resistance to androgen-deprivation therapy in prostate cancer.

Cooper CS, Eeles R, Wedge DC, Van Loo P, Gundem G, Alexandrov LB, Kremeyer B, Butler A, Lynch AG, Camacho N et al. 2015. Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue. Nat Genet, 47 (4), pp. 367-372.

Genome-wide DNA sequencing was used to decrypt the phylogeny of multiple samples from distinct areas of cancer and morphologically normal tissue taken from the prostates of three men. Mutations were present at high levels in morphologically normal tissue distant from the cancer, reflecting clonal expansions, and the underlying mutational processes at work in morphologically normal tissue were also at work in cancer. Our observations demonstrate the existence of ongoing abnormal mutational processes, consistent with field effects, underlying carcinogenesis. This mechanism gives rise to extensive branching evolution and cancer clone mixing, as exemplified by the coexistence of multiple cancer lineages harboring distinct ERG fusions within a single cancer nodule. Subsets of mutations were shared either by morphologically normal and malignant tissues or between different ERG lineages, indicating earlier or separate clonal cell expansions. Our observations inform on the origin of multifocal disease and have implications for prostate cancer therapy in individual cases.

Murchison EP, Wedge DC, Alexandrov LB, Fu B, Martincorena I, Ning Z, Tubio JMC, Werner EI, Allen J, De Nardi AB et al. 2014. Transmissible [corrected] dog cancer genome reveals the origin and history of an ancient cell lineage. Science, 343 (6169), pp. 437-440.

Canine transmissible venereal tumor (CTVT) is the oldest known somatic cell lineage. It is a transmissible cancer that propagates naturally in dogs. We sequenced the genomes of two CTVT tumors and found that CTVT has acquired 1.9 million somatic substitution mutations and bears evidence of exposure to ultraviolet light. CTVT is remarkably stable and lacks subclonal heterogeneity despite thousands of rearrangements, copy-number changes, and retrotransposon insertions. More than 10,000 genes carry nonsynonymous variants, and 646 genes have been lost. CTVT first arose in a dog with low genomic heterozygosity that may have lived about 11,000 years ago. The cancer spawned by this individual dispersed across continents about 500 years ago. Our results provide a genetic identikit of an ancient dog and demonstrate the robustness of mammalian somatic cells to survive for millennia despite a massive mutation burden.

Bolli N, Avet-Loiseau H, Wedge DC, Van Loo P, Alexandrov LB, Martincorena I, Dawson KJ, Iorio F, Nik-Zainal S, Bignell GR et al. 2014. Heterogeneity of genomic evolution and mutational profiles in multiple myeloma. Nat Commun, 5 (1), pp. 2997.

Multiple myeloma is an incurable plasma cell malignancy with a complex and incompletely understood molecular pathogenesis. Here we use whole-exome sequencing, copy-number profiling and cytogenetics to analyse 84 myeloma samples. Most cases have a complex subclonal structure and show clusters of subclonal variants, including subclonal driver mutations. Serial sampling reveals diverse patterns of clonal evolution, including linear evolution, differential clonal response and branching evolution. Diverse processes contribute to the mutational repertoire, including kataegis and somatic hypermutation, and their relative contribution changes over time. We find heterogeneity of mutational spectrum across samples, with few recurrent genes. We identify new candidate genes, including truncations of SP140, LTB, ROBO1 and clustered missense mutations in EGR1. The myeloma genome is heterogeneous across the cohort, and exhibits diversity in clonal admixture and in dynamics of evolution, which may impact prognostic stratification, therapeutic approaches and assessment of disease response to treatment.

Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SAJR, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Børresen-Dale A-L et al. 2013. Signatures of mutational processes in human cancer. Nature, 500 (7463), pp. 415-421.

All cancers are caused by somatic mutations; however, understanding of the biological processes generating these mutations is limited. The catalogue of somatic mutations from a cancer genome bears the signatures of the mutational processes that have been operative. Here we analysed 4,938,362 mutations from 7,042 cancers and extracted more than 20 distinct mutational signatures. Some are present in many cancer types, notably a signature attributed to the APOBEC family of cytidine deaminases, whereas others are confined to a single cancer class. Certain signatures are associated with age of the patient at cancer diagnosis, known mutagenic exposures or defects in DNA maintenance, but many are of cryptic origin. In addition to these genome-wide mutational signatures, hypermutation localized to small genomic regions, 'kataegis', is found in many cancer types. The results reveal the diversity of mutational processes underlying the development of cancer, with potential implications for understanding of cancer aetiology, prevention and therapy.

Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR. 2013. Deciphering signatures of mutational processes operative in human cancer. Cell Rep, 3 (1), pp. 246-259.

The genome of a cancer cell carries somatic mutations that are the cumulative consequences of the DNA damage and repair processes operative during the cellular lineage between the fertilized egg and the cancer cell. Remarkably, these mutational processes are poorly characterized. Global sequencing initiatives are yielding catalogs of somatic mutations from thousands of cancers, thus providing the unique opportunity to decipher the signatures of mutational processes operative in human cancer. However, until now there have been no theoretical models describing the signatures of mutational processes operative in cancer genomes and no systematic computational approaches are available to decipher these mutational signatures. Here, by modeling mutational processes as a blind source separation problem, we introduce a computational framework that effectively addresses these questions. Our approach provides a basis for characterizing mutational signatures from cancer-derived somatic mutational catalogs, paving the way to insights into the pathogenetic mechanism underlying all cancers.

Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, Raine K, Jones D, Marshall J, Ramakrishna M et al. 2012. The life history of 21 breast cancers. Cell, 149 (5), pp. 994-1007.

Cancer evolves dynamically as clonal expansions supersede one another driven by shifting selective pressures, mutational processes, and disrupted cancer genes. These processes mark the genome, such that a cancer's life history is encrypted in the somatic mutations present. We developed algorithms to decipher this narrative and applied them to 21 breast cancers. Mutational processes evolve across a cancer's lifespan, with many emerging late but contributing extensive genetic variation. Subclonal diversification is prominent, and most mutations are found in just a fraction of tumor cells. Every tumor has a dominant subclonal lineage, representing more than 50% of tumor cells. Minimal expansion of these subclones occurs until many hundreds to thousands of mutations have accumulated, implying the existence of long-lived, quiescent cell lineages capable of substantial proliferation upon acquisition of enabling genomic changes. Expansion of the dominant subclone to an appreciable mass may therefore represent the final rate-limiting step in a breast cancer's development, triggering diagnosis.
