ParsVNN: parsimony visible neural networks for uncovering cancer-specific and drug-sensitive genes and pathways

2021 ◽  
Vol 3 (4) ◽  
Author(s):  
Xiaoqing Huang ◽  
Kun Huang ◽  
Travis Johnson ◽  
Milan Radovich ◽  
Jie Zhang ◽  
...  

Abstract Prediction of cancer-specific drug responses as well as identification of the corresponding drug-sensitive genes and pathways remains a major biological and clinical challenge. Deep learning models hold immense promise for better drug response predictions, but most of them cannot provide biological and clinical interpretability. Visible neural network (VNN) models have emerged to solve the problem by giving neurons biological meanings and directly casting biological networks into the models. However, the biological networks used in VNNs are often redundant and contain components that are irrelevant to the downstream predictions. Therefore, the VNNs using these redundant biological networks are overparameterized, which significantly limits VNNs’ predictive and explanatory power. To overcome the problem, we treat the edges and nodes in biological networks used in VNNs as features and develop a sparse learning framework ParsVNN to learn parsimony VNNs with only edges and nodes that contribute the most to the prediction task. We applied ParsVNN to build cancer-specific VNN models to predict drug response for five different cancer types. We demonstrated that the parsimony VNNs built by ParsVNN are superior to other state-of-the-art methods in terms of prediction performance and identification of cancer driver genes. Furthermore, we found that the pathways selected by ParsVNN have great potential to predict clinical outcomes as well as recommend synergistic drug combinations.
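The abstract does not say which sparsity penalty ParsVNN uses, so the following is only a minimal sketch of one common way to remove whole nodes from a network: a group-lasso proximal step (block soft-thresholding) applied to the incoming weights of each neuron. The group names, weight values, and threshold are hypothetical and chosen purely for illustration.

```python
import math


def group_soft_threshold(weights, threshold):
    """Proximal step for a group-lasso penalty on one group of weights.

    Shrinks the group's Euclidean norm by `threshold`; a group whose norm
    is at or below the threshold is set to zero entirely.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= threshold:
        return [0.0] * len(weights)
    scale = 1.0 - threshold / norm
    return [w * scale for w in weights]


def prune_groups(groups, threshold):
    """Shrink every named group and report which groups keep a nonzero weight."""
    shrunk = {name: group_soft_threshold(ws, threshold) for name, ws in groups.items()}
    kept = [name for name, ws in shrunk.items() if any(w != 0.0 for w in ws)]
    return shrunk, kept


# Hypothetical incoming weights for two pathway neurons (illustrative values only).
groups = {
    "pathway_A": [3.0, 4.0],  # norm 5.0, survives a threshold of 1.0
    "pathway_B": [0.3, 0.4],  # norm 0.5, zeroed by a threshold of 1.0
}
shrunk, kept = prune_groups(groups, threshold=1.0)
```

A neuron whose whole weight group is zeroed no longer contributes to the prediction, so it can be dropped from the model. That is the sense in which a group penalty selects nodes rather than individual weights.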

2018 ◽  
Author(s):  
Olivier Collier ◽  
Véronique Stoven ◽  
Jean-Philippe Vert

Abstract Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help find new therapeutic targets. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly for driver genes specific to some cancer types. In this paper we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning based approach that integrates various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms three other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types.

Author summary Cancer development is driven by mutations and dysfunction of important, so-called cancer driver genes, which could be addressed by targeted therapies. While a number of such cancer genes have already been identified, it is believed that many more remain to be discovered. To help prioritize experimental investigations of candidate genes, several computational methods have been proposed to rank promising candidates based on their mutations in large cohorts of cancer cases, or on their interactions with known driver genes in biological networks. We propose LOTUS, a new computational approach to identify genes with high oncogenic potential. LOTUS implements a machine learning approach to learn an oncogenic potential score from known driver genes, and brings two novelties compared to existing methods. First, it makes it easy to combine heterogeneous information in the scoring function, which we illustrate by learning a scoring function from both known mutations in large cancer cohorts and interactions in biological networks. Second, using a multitask learning strategy, it can predict different driver genes for different cancer types, while sharing information between them to improve the prediction for every type. We provide experimental results showing that LOTUS significantly outperforms several state-of-the-art cancer gene prediction software tools.


2020 ◽  
Vol 49 (D1) ◽  
pp. D1289-D1301 ◽  
Author(s):  
Tao Wang ◽  
Shasha Ruan ◽  
Xiaolu Zhao ◽  
Xiaohui Shi ◽  
Huajing Teng ◽  
...  

Abstract The prevalence of neutral mutations in cancer cell populations makes it difficult to distinguish cancer-causing driver mutations from passenger mutations. To systematically prioritize the oncogenic ability of somatic mutations and cancer genes, we constructed a platform, OncoVar (https://oncovar.org/), which employs published bioinformatics algorithms and incorporates known driver events to identify driver mutations and driver genes. We identified 20 162 cancer driver mutations, 814 driver genes and 2360 pathogenic pathways with high confidence by reanalyzing 10 769 exomes from 33 cancer types in The Cancer Genome Atlas (TCGA) and 1942 genomes from 18 cancer types in the International Cancer Genome Consortium (ICGC). OncoVar provides four points of view, ‘Mutation’, ‘Gene’, ‘Pathway’ and ‘Cancer’, to help researchers visualize the relationships between cancers and driver variants. Importantly, identification of actionable driver alterations provides promising druggable targets and repurposing opportunities for combination therapies. OncoVar provides a user-friendly interface for browsing, searching and downloading somatic driver mutations, driver genes and pathogenic pathways in various cancer types. This platform will facilitate the identification of cancer drivers across individual cancer cohorts and help rank mutations or genes for better decision-making among clinical oncologists, cancer researchers and the broad scientific community interested in cancer precision medicine.


2017 ◽  
Author(s):  
Luis Zapata ◽  
Hana Susak ◽  
Oliver Drechsel ◽  
Marc R. Friedländer ◽  
Xavier Estivill ◽  
...  

Abstract Tumors are composed of an evolving population of cells subjected to tissue-specific selection, which fuels tumor heterogeneity and ultimately complicates cancer driver gene identification. Here, we integrate cancer cell fraction, population recurrence, and functional impact of somatic mutations as signatures of selection into a Bayesian inference model for driver prediction. In an in-depth benchmark, we demonstrate that our model, cDriver, outperforms competing methods when analyzing solid tumors, hematological malignancies, and pan-cancer datasets. Applying cDriver to exome sequencing data of 21 cancer types from 6,870 individuals revealed 98 unreported tumor type-driver gene connections. These novel connections are highly enriched for chromatin-modifying proteins, hinting at a universal role of chromatin regulation in cancer etiology. Although infrequently mutated as single genes, we show that chromatin modifiers are altered in a large fraction of cancer patients. In summary, we demonstrate that integration of evolutionary signatures is key for identifying mutational driver genes, thereby facilitating the discovery of novel therapeutic targets for cancer treatment.


Author(s):  
Joo Sang Lee ◽  
Nishanth Ulhas Nair ◽  
Lesley Chapman ◽  
Sanju Sinha ◽  
Kun Wang ◽  
...  

Abstract Precision oncology has made significant advances in the last few years, mainly by targeting actionable mutations in cancer driver genes. However, the proportion of patients whose tumors can be targeted therapeutically remains limited. Recent studies have begun to explore the benefit of analyzing tumor transcriptomics data to guide patient treatment, raising the need for new approaches for systematically accomplishing that. Here we show that computationally derived genetic interactions can successfully predict patient response. Assembling a broad repertoire of 32 datasets spanning more than 1,500 patients and including both tumor transcriptomics and response data, we predicted the response in 17 out of 21 targeted and 8 out of 11 checkpoint therapy datasets across 8 different cancer types with considerable accuracy, without ever training on these datasets. Analyzing the recently published multi-arm WINTHER trial, we show that the fraction of patients benefitting from transcriptomic-based treatments could potentially be markedly increased from 15% to about 85% by targeting synthetic lethal vulnerabilities in their tumors. In summary, this is the first computational approach to obtain considerable predictive performance across many different targeted and immunotherapy datasets, providing a promising new way for guiding cancer treatment based on the tumor transcriptomics of cancer patients.


2017 ◽  
Vol 46 (D1) ◽  
pp. D1027-D1030 ◽  
Author(s):  
Xin Feng ◽  
Lei Li ◽  
Eric J Wagner ◽  
Wei Li

Abstract Widespread alternative polyadenylation (APA) occurs during enhanced cellular proliferation and transformation. Recently, we demonstrated that CFIm25-mediated 3′ UTR shortening through APA promotes glioblastoma tumor growth in vitro and in vivo, further underscoring its significance to tumorigenesis. Here, we report The Cancer 3′ UTR Atlas (TC3A), a comprehensive resource of APA usage for 10,537 tumors across 32 cancer types. These APA events represent potentially novel prognostic biomarkers and may uncover novel mechanisms for the regulation of cancer driver genes. TC3A is built on top of the now de facto standard cBioPortal. Therefore, the large community of existing cBioPortal users and clinical researchers will find TC3A familiar and immediately usable. TC3A is currently fully functional and freely available at http://tc3a.org.


2021 ◽  
Author(s):  
Chenye Wang ◽  
Junhan Shi ◽  
Jiansheng Cai ◽  
Yusen Zhang ◽  
Xiaoqi Zheng ◽  
...  

Abstract Background: Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data. A critical challenge in cancer genomics is the identification of a few driver mutation genes from a much larger number of passenger mutation genes. However, the majority of existing computational approaches underuse the co-occurrence information of the individuals, which is deemed important in tumorigenesis and tumor progression. Driver gene lists predicted by these tools are prone to false positives, and recent research is far from achieving the ultimate goal of discovering a complete catalog of driver genes. Results: To make full use of co-mutation information, we present a random walk algorithm, referred to as DriverRWH, on a weighted gene mutation hypergraph model, using somatic mutation data and molecular interaction network data to prioritize candidate driver genes. Applied to tumor samples of different cancer types from The Cancer Genome Atlas (TCGA), DriverRWH shows significantly better performance than state-of-the-art prioritization methods in terms of the area under the curve (AUC) scores and the cumulative number of known driver genes recovered in top-ranked candidate genes. DriverRWH recovers approximately 50% of known driver genes in the top 30 ranked candidate genes for more than half of the cancer types. In addition, DriverRWH is highly robust to perturbations in the mutation data and gene functional network data. Conclusion: DriverRWH is effective across various cancer types in prioritizing cancer driver genes and provides considerable improvement over other tools, with a better balance of precision and sensitivity. It can be a useful tool for detecting potential driver genes and facilitating targeted cancer therapies.
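The abstract does not describe how DriverRWH builds its hypergraph or sets its weights, so the sketch below only illustrates the general technique it names: a random walk with restart on a weighted hypergraph. It assumes that each hyperedge is a weighted set of genes (for example, the genes mutated together in one sample) and uses a standard hypergraph walk: pick a hyperedge in proportion to its weight, then pick a gene inside it uniformly. The gene labels, weights, and restart probability are hypothetical.

```python
def hypergraph_transition(hyperedges):
    """Build row-stochastic transition probabilities P[u][v] from weighted hyperedges.

    From node u, the walker picks a hyperedge containing u with probability
    proportional to the hyperedge weight, then picks a node of that hyperedge
    uniformly at random (possibly u itself).
    """
    nodes = sorted({v for _, members in hyperedges for v in members})
    degree = {u: 0.0 for u in nodes}
    for weight, members in hyperedges:
        for u in members:
            degree[u] += weight
    P = {u: {v: 0.0 for v in nodes} for u in nodes}
    for weight, members in hyperedges:
        for u in members:
            for v in members:
                P[u][v] += (weight / degree[u]) / len(members)
    return nodes, P


def random_walk_with_restart(P, start, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * P^T p + restart * start until p stops changing."""
    nodes = list(P)
    p = dict(start)
    for _ in range(max_iter):
        nxt = {v: restart * start[v] for v in nodes}
        for u in nodes:
            for v in nodes:
                nxt[v] += (1.0 - restart) * p[u] * P[u][v]
        if sum(abs(nxt[v] - p[v]) for v in nodes) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy hypergraph: (weight, set of genes) per hyperedge.
hyperedges = [
    (1.0, {"G1", "G2"}),
    (1.0, {"G1", "G3"}),
    (0.5, {"G2", "G4"}),
]
nodes, P = hypergraph_transition(hyperedges)
start = {u: 1.0 / len(nodes) for u in nodes}  # uniform restart distribution
scores = random_walk_with_restart(P, start)
ranking = sorted(nodes, key=scores.get, reverse=True)
```

Genes that sit in many heavily weighted hyperedges accumulate more probability mass, so the final scores can be used directly as a prioritization ranking. In practice the restart distribution would encode prior evidence such as mutation frequency rather than being uniform.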


2021 ◽  
Author(s):  
Langyu Gu ◽  
Guofen Yang

Cancer is one of the most threatening diseases to humans. Understanding the evolution of cancer genes is helpful for therapy management. However, systematic investigation of the evolution of cancer driver genes is sparse. Using comparative genomic analysis, population genetics analysis and computational molecular evolutionary analysis, we examined the evolution of 568 cancer driver genes of 66 cancer types across the primate phylogeny (long timescale selection), and in modern human populations from the 1000 Genomes Project (recent selection). We found that recent selection pressures, rather than long timescale selection, significantly affect the evolution of cancer driver genes in humans. Cancer driver genes related to morphological traits and local adaptation are under positive selection in different human populations. The African population showed the largest extent of divergence compared to other populations. It is worth noting that the corresponding cancer types of positively selected genes exhibited population-specific patterns, with the South Asian population possessing the fewest cancer types. This helps explain why the South Asian population usually has low cancer incidence rates. Population-specific patterns of cancer types whose driver genes are under positive selection also give clues to explain discrepancies in cancer incidence rates between geographical populations, such as the high incidence rate of Wilms tumour in the African population and of Ewing's sarcomas in the European population. Our findings are thus helpful for understanding cancer evolution and providing guidance for further precision medicine.


2019 ◽  
Author(s):  
Pramod Chandrashekar ◽  
Navid Ahmadinejad ◽  
Junwen Wang ◽  
Aleksandar Sekulic ◽  
Jan B. Egan ◽  
...  

Abstract Functions of cancer driver genes depend on cellular contexts that vary substantially across tissues and organs. Distinguishing oncogenes (OGs) and tumor suppressor genes (TSGs) for each cancer type is critical to identifying clinically actionable targets. However, current resources for context-aware classifications of cancer drivers are limited. In this study, we show that the direction and magnitude of somatic selection of missense and truncating mutations of a gene are suggestive of its contextual activities. By integrating these features with ratiometric and conservation measures, we developed a computational method to categorize OGs and TSGs using exome sequencing data. This new method, named genes under selection in tumors (GUST), shows an overall accuracy of 0.94 when tested on manually curated benchmarks. Application of GUST to 10,172 tumor exomes of 33 cancer types identified 98 OGs and 179 TSGs, >70% of which promote tumorigenesis in only one cancer type. In broad-spectrum drivers shared across multiple cancer types, we found heterogeneous mutational hotspots modifying distinct functional domains, implicating the synchrony of convergent and divergent disease mechanisms. We further discovered two novel OGs and 28 novel TSGs with high confidence. The GUST program is available at https://github.com/liliulab/gust. A database with pre-computed classifications is available at https://liliulab.shinyapps.io/gust


2018 ◽  
Author(s):  
Sam F. L. Windels ◽  
Noël Malod-Dognin ◽  
Nataša Pržulj

Abstract Motivation: Laplacian matrices capture the global structure of networks and are widely used to study biological networks. However, the local structure of the network around a node can also capture biological information. Local wiring patterns are typically quantified by counting how often a node touches different graphlets (small, connected, induced sub-graphs). Currently available graphlet-based methods do not consider whether nodes are in the same network neighbourhood. Contribution: To combine graphlet-based topological information and membership of nodes to the same network neighbourhood, we generalize the Laplacian to the Graphlet Laplacian, by considering a pair of nodes to be ‘adjacent’ if they simultaneously touch a given graphlet. Results: We utilize Graphlet Laplacians to generalize spectral embedding, spectral clustering and network diffusion. Applying our generalization of spectral clustering to model networks and biological networks shows that Graphlet Laplacians capture different local topology corresponding to the underlying graphlet. In biological networks, clusters obtained by using different Graphlet Laplacians capture complementary sets of biological functions. By diffusing pan-cancer gene mutation scores based on different Graphlet Laplacians, we find complementary sets of cancer driver genes. Hence, we demonstrate that Graphlet Laplacians capture topology-function and topology-disease relationships in biological networks.
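To make the idea of graphlet-based adjacency concrete, here is a minimal sketch for a single graphlet, the triangle. Two nodes count as adjacent once for every triangle they both touch, and the Laplacian is formed as degree minus adjacency. The function name and the unnormalized form are assumptions made for illustration; the authors' exact definition and scaling may differ.

```python
from itertools import combinations


def triangle_graphlet_laplacian(nodes, edges):
    """Laplacian in which u and v are 'adjacent' once per triangle they both touch."""
    neighbours = {u: set() for u in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    index = {u: i for i, u in enumerate(nodes)}
    n = len(nodes)
    adjacency = [[0] * n for _ in range(n)]

    # Enumerate every node triple and keep those that form a triangle.
    for a, b, c in combinations(nodes, 3):
        if b in neighbours[a] and c in neighbours[a] and c in neighbours[b]:
            for u, v in ((a, b), (a, c), (b, c)):
                adjacency[index[u]][index[v]] += 1
                adjacency[index[v]][index[u]] += 1

    # Off-diagonal entries are minus the adjacency; the diagonal is the row sum.
    laplacian = [[-adjacency[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        laplacian[i][i] = sum(adjacency[i])
    return laplacian


# Toy network: one triangle (A, B, C) plus a pendant edge C-D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
L = triangle_graphlet_laplacian(nodes, edges)
```

In this toy network the edge C-D appears in the ordinary Laplacian but contributes nothing here, because D touches no triangle. This is how a graphlet-specific Laplacian isolates one kind of local wiring pattern.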


2019 ◽  
Author(s):  
Rafsan Ahmed ◽  
Ilyes Baali ◽  
Cesim Erten ◽  
Evis Hoxha ◽  
Hilal Kazan

Abstract Motivation: Genomic analyses from large cancer cohorts have revealed the mutational heterogeneity problem, which hinders the identification of driver genes based only on mutation profiles. One way to tackle this problem is to incorporate the fact that genes act together in functional modules. The connectivity knowledge present in existing protein-protein interaction networks, together with mutation frequencies of genes and the mutual exclusivity of cancer mutations, can be utilized to increase the accuracy of identifying cancer driver modules. Results: We present a novel edge-weighted random walk-based approach that incorporates connectivity information in the form of protein-protein interactions, mutual exclusion, and coverage to identify cancer driver modules. MEXCOwalk outperforms several state-of-the-art computational methods on TCGA pan-cancer data in terms of recovering known cancer genes, providing modules that are capable of classifying normal and tumor samples, and that are enriched for mutations in specific cancer types. Furthermore, the risk scores determined with output modules can stratify patients into low-risk and high-risk groups in multiple cancer types. MEXCOwalk identifies modules containing both well-known cancer genes and putative cancer genes that are rarely mutated in the pan-cancer data. The data, the source code, and useful scripts are available at: https://github.com/abu-compbio/[email protected]
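The abstract names mutual exclusion and coverage as ingredients of the edge weights but gives no formulas. The sketch below therefore assumes two simple, commonly used definitions for a gene pair: coverage as the fraction of samples mutated in either gene, and mutual exclusion as the size of the union of mutated samples divided by the sum of the two individual counts. Combining them by multiplication is also an assumption, and the gene and sample names are hypothetical.

```python
def coverage(samples_a, samples_b, n_samples):
    """Fraction of all samples that carry a mutation in at least one of the two genes."""
    return len(samples_a | samples_b) / n_samples


def mutual_exclusion(samples_a, samples_b):
    """1.0 when the two genes are never mutated in the same sample, lower with overlap."""
    total = len(samples_a) + len(samples_b)
    if total == 0:
        return 0.0
    return len(samples_a | samples_b) / total


def edge_weight(samples_a, samples_b, n_samples):
    """Assumed combination: product of mutual exclusion and coverage."""
    return mutual_exclusion(samples_a, samples_b) * coverage(samples_a, samples_b, n_samples)


# Hypothetical mutation data: gene -> set of samples in which it is mutated.
mutated = {
    "GENE_A": {"s1", "s2"},
    "GENE_B": {"s3", "s4"},
    "GENE_C": {"s1", "s2", "s3"},
}
n_samples = 5

w_ab = edge_weight(mutated["GENE_A"], mutated["GENE_B"], n_samples)  # disjoint sample sets
w_ac = edge_weight(mutated["GENE_A"], mutated["GENE_C"], n_samples)  # overlapping sample sets
```

Under these assumed definitions, an interaction between two genes mutated in disjoint sets of samples gets a higher weight than one between genes whose mutations co-occur, so a walk over the weighted network is drawn toward mutually exclusive, high-coverage neighbourhoods.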

