We employ a computational framework that integrates mathematical programming and Graph Neural Networks (GNNs) to elucidate functional phenotypic heterogeneity in disease by classifying entire pathways under various conditions of interest. The approach combines two distinct but tightly coupled modeling schemes. First, we leverage Prior Knowledge Networks (PKNs) to reconstruct gene networks from genomic and transcriptomic data. We demonstrate how this can be achieved through mathematical programming optimization and provide examples using comprehensive, established databases. Second, we tailor GNNs to classify each network as a single data point at the graph level, using various node embeddings and edge attributes. These networks may vary in their biological or molecular annotations, which serve as labels for supervised classification. We apply the framework to the human DNA damage and repair pathway, using the TP53 regulon in a pan-cancer study across cell lines and tumor samples to classify Gene Regulatory Networks (GRNs) by TP53 mutation type. This allows us to identify mutations with distinguishable functional profiles that can be related to specific phenotypes, thus providing a data-driven pipeline for genotype-to-phenotype translation. The approach is scalable: it enables the classification of diverse conditions within multi-factorial diseases and helps disentangle their polygenic complexity by revealing new functional patterns through a causal representation.
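To make the idea of classifying a whole network "as a single data point at the graph level" concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes one round of mean-aggregation message passing over signed, directed edges, a mean-pool readout, and an untrained linear classifier. The gene names other than TP53, the feature values, the edge signs, the weights, and the class labels are all hypothetical placeholders chosen for illustration.

```python
# Illustrative sketch only: one mean-aggregation message-passing step followed
# by a mean-pool readout. Names, values, and weights are hypothetical.
from typing import Dict, List, Tuple

Vector = List[float]
Edge = Tuple[str, str, float]  # (source, target, signed weight); sign is an assumption


def message_pass(features: Dict[str, Vector], edges: List[Edge]) -> Dict[str, Vector]:
    """Replace each node vector with the mean of its own features and the
    edge-weighted features of its incoming neighbours."""
    dim = len(next(iter(features.values())))
    sums = {node: list(vec) for node, vec in features.items()}  # self term
    counts = {node: 1 for node in features}
    for src, dst, weight in edges:
        for k in range(dim):
            sums[dst][k] += weight * features[src][k]
        counts[dst] += 1
    return {node: [x / counts[node] for x in sums[node]] for node in features}


def graph_embedding(features: Dict[str, Vector], edges: List[Edge]) -> Vector:
    """Mean-pool the updated node vectors into a single vector per graph."""
    updated = message_pass(features, edges)
    dim = len(next(iter(updated.values())))
    return [sum(vec[k] for vec in updated.values()) / len(updated) for k in range(dim)]


def classify(embedding: Vector, weights: List[Vector], labels: List[str]) -> str:
    """Linear readout: return the label whose weight row scores highest."""
    scores = [sum(w * x for w, x in zip(row, embedding)) for row in weights]
    return labels[scores.index(max(scores))]


# Toy network: three nodes with 2-dimensional node features.
toy_features = {"TP53": [1.0, 0.0], "gene_a": [0.0, 1.0], "gene_b": [1.0, 1.0]}
toy_edges = [("TP53", "gene_a", 1.0), ("TP53", "gene_b", 1.0), ("gene_a", "TP53", -1.0)]

toy_embedding = graph_embedding(toy_features, toy_edges)
toy_label = classify(toy_embedding, [[1.0, 0.0], [0.0, 1.0]], ["class_0", "class_1"])
```

In a real setting the readout weights would be learned from many labeled networks, and the pooling step is what lets networks of different sizes map to fixed-length vectors that a classifier can compare.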
Journal article
2025-08-01T00:00:00+00:00
11
Nuffield Department of Medicine, University of Oxford, Oxford, UK. harry.triantafyllidis@ndm.ox.ac.uk.
Humans, DNA Damage, Computational Biology, Genomics, Causality, Phenotype, Mutation, Algorithms, Tumor Suppressor Protein p53, Gene Regulatory Networks, Neural Networks, Computer, Graph Neural Networks