Motivation Colorectal cancer has the second-highest mortality rate among cancers worldwide, which calls for the development of advanced diagnostics and individualized therapies. Knowledge of the interactions between molecular entities helps to detect the genes responsible for driving cancer progression. Graph Convolutional Neural Networks can exploit the prior knowledge provided by interaction networks, and the Spektral library offers a performance increase over standard implementations. Furthermore, machine learning technology shows great potential to assist medical professionals through guided clinical decision support. However, the application of deep learning models in precision medicine is limited by their inability to explain the factors contributing to a prediction. Adapting the Graph Layer-Wise Relevance Propagation methodology to graph-based deep learning models makes it possible to attribute the learned outcome to single genes and to determine their relevance. The resulting patient-specific subnetworks can then be used to identify potentially targetable genes.
Results We present an implementation of Graph Convolutional Neural Networks using the Spektral library in combination with adapted functions for Graph Layer-Wise Relevance Propagation. Deep learning models were trained on a newly composed large gene expression dataset of colorectal cancer patients, using different molecular interaction networks as prior knowledge: protein-protein interactions from the Human Protein Reference Database and STRING, and pathways from the Reactome database. Our implementation performs comparably to the original implementation while reducing computation time, especially for large networks. Furthermore, the generated subnetworks are similar to those of the original implementation and reveal possible biomarkers and drug targets, including more distant ones.
Availability The implementation details and corresponding dataset including their visualizations can be found at https://github.com/frankkramer-lab/spektral-gcnn-glrp-on-crc-data
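To make the idea of attributing a prediction to single genes more concrete, the following is a minimal sketch of the generic epsilon rule of Layer-Wise Relevance Propagation for a single dense layer. It is not the Graph LRP adaptation from the repository above; the function name `lrp_epsilon_dense`, the toy weights, and the three input values standing in for gene expression levels are all assumptions made for illustration.

```python
import numpy as np

def lrp_epsilon_dense(activations, weights, relevance_out, eps=1e-9):
    """Redistribute output relevance to the inputs of one dense layer.

    Generic LRP-epsilon rule (no bias term):
        R_i = a_i * sum_j( w_ij * R_j / (z_j + eps * sign(z_j)) )
    where z_j = sum_i a_i * w_ij.
    """
    z = activations @ weights                      # pre-activations, shape (n_out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilizer avoids division by zero
    s = relevance_out / z                          # relevance per unit of pre-activation
    return activations * (weights @ s)             # relevance per input, shape (n_in,)

# Hypothetical toy layer: 3 input "genes", 2 output classes.
activations = np.array([1.0, 2.0, 0.5])
weights = np.array([[0.2, -0.1],
                    [0.4,  0.3],
                    [-0.6, 0.8]])
relevance_out = np.array([1.0, 0.0])               # all relevance starts at class 0

relevance_in = lrp_epsilon_dense(activations, weights, relevance_out)
print(relevance_in)                                # one relevance score per input
```

Without a bias term and with a small `eps`, the relevance scores of the inputs sum to approximately the relevance that was fed in at the output, which is the conservation property that lets the scores be read as each input's share of the prediction.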
Motivation Collaborative workflows in network biology require documentation not only of the performed analysis steps but also of the network data on which the decisions were based. However, replicating the entire workflow or tracking the intermediate networks used for a particular visualization remains an intricate task. In addition, the amount and heterogeneity of the integrated data require instruments to explore and thus comprehend the results.
Results Here we demonstrate a collection of software tools and libraries for network data integration, exploration, and visualization to document the different stages of the workflow. The integrative steps are performed in R, and the entire process is accompanied by an interchangeable toolset for data exploration and network visualization.
Availability The source code of the performed workflow is available as R Markdown scripts at https://github.com/frankkramer-lab/reproducible-network-visualization. A compiled HTML version is also hosted on GitHub Pages at https://frankkramer-lab.github.io/reproducible-network-visualization.
Gene expression data is commonly available in cancer research and provides a snapshot of the molecular status of a specific tumor tissue. This high-dimensional data can be analyzed to support diagnosis and prognosis and to suggest treatment options. Machine learning based methods are widely used for such analyses. Recently, a set of deep learning techniques has been successfully applied in different domains, including bioinformatics. One prominent technique among these is the convolutional neural network (CNN). Currently, CNNs are being extended to non-Euclidean domains such as graphs. Molecular networks are commonly represented as graphs detailing interactions between molecules. Gene expression data can be assigned to the vertices of these graphs, and the edges can depict interactions, regulations, and signal flow. In other words, gene expression data can be structured by utilizing molecular network information as prior knowledge. Here, we applied a graph CNN to gene expression data of breast cancer patients to predict the occurrence of metastatic events. To structure the data, we utilized a protein-protein interaction network. We show that the graph CNN exploiting this prior knowledge improves the classification of metastatic events compared to existing methods.
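As an illustration of how expression values assigned to the vertices of an interaction network can be processed, here is a minimal sketch of a single graph convolution step. It assumes the widely used propagation rule with a symmetrically normalized adjacency matrix and self-loops, which may differ from the graph CNN variant used in the work described above; the four-gene network, the expression values, and the function names are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix A."""
    a_hat = adj + np.eye(adj.shape[0])             # add self-loops
    deg = a_hat.sum(axis=1)                        # node degrees including self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(x, a_norm, weights):
    """One propagation step with ReLU: relu(A_norm @ X @ W)."""
    return np.maximum(a_norm @ x @ weights, 0.0)

# Hypothetical toy interaction network: 4 genes connected in a chain 0-1-2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# One expression value per gene for a single patient (made-up numbers).
expression = np.array([[2.0], [0.5], [1.0], [3.0]])

rng = np.random.default_rng(0)
weights = rng.normal(size=(1, 2))                  # 1 input feature -> 2 hidden features

a_norm = normalized_adjacency(adj)
hidden = graph_conv(expression, a_norm, weights)
print(hidden.shape)                                # (4, 2): two features per gene
```

Each gene's new features are a weighted mix of its own expression and that of its direct interaction partners, which is how the network acts as prior knowledge that structures the data.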