Marco Gustav, Marko van Treeck, Nic G. Reitsam, Zunamys I. Carrero, Chiara M. L. Loeffler, Asier Rabasco Meneghetti, Bruno Märkl, Lisa A. Boardman, Amy J. French, Ellen L. Goode, Andrea Gsur, Stefanie Brezina, Marc J. Gunter, Neil Murphy, Pia Hönscheid, Christian Sperling, Sebastian Foersch, Robert Steinfelder, Tabitha Harrison, Ulrike Peters, Amanda Phipps, Jakob Nikolas Kather
Background
Deep learning-based models enable the prediction of molecular biomarkers from histopathology slides of colorectal cancer stained with haematoxylin and eosin; however, few studies have assessed prediction targets beyond microsatellite instability (MSI), BRAF, and KRAS systematically. We aimed to develop and validate a multi-target model based on deep learning for the simultaneous prediction of numerous genetic alterations and their associated phenotypes in colorectal cancer.
Methods
In this multicentre cohort study, tissue samples from patients with colorectal cancer were obtained by surgical resection and stained with haematoxylin and eosin. These samples were then digitised into whole-slide images and used to train and test a transformer-based deep learning algorithm for biomarker detection to simultaneously predict multiple genetic alterations and provide heatmap explanations. The primary dataset comprised 1376 patients from five cohorts who underwent comprehensive panel sequencing, with an additional 536 patients from two public datasets for validation. We compared the model's performance against conventional single-target models and examined the co-occurrence of alterations and shared morphology.
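The co-occurrence analysis mentioned above can be illustrated with a minimal sketch. The abstract does not state which statistic was used, so the phi coefficient here is an assumption, and the function name and patient data are hypothetical placeholders rather than study data.

```python
from math import sqrt


def phi_coefficient(a, b):
    """Phi coefficient between two binary alteration vectors (1 = altered).

    Assumption: phi is used purely for illustration; the abstract does not
    name the co-occurrence statistic applied in the study.
    """
    if len(a) != len(b):
        raise ValueError("both vectors must cover the same patients")
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # Undefined when a marginal is empty; report 0.0 in that case.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Hypothetical per-patient statuses, not taken from the study.
msi = [1, 1, 1, 0, 0, 0, 0, 0]
braf = [1, 1, 0, 0, 0, 0, 0, 1]
print(f"phi(MSI, BRAF) = {phi_coefficient(msi, braf):.2f}")
```

A value near 1 would indicate that an alteration almost always accompanies MSI, which is the situation in which a model can score well on that target by recognising MSI-associated morphology alone.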
Findings
The multi-target model was able to predict numerous biomarkers from pathology slides, matching and partly exceeding single-target transformers. In the primary external validation cohorts, mean area under the receiver operating characteristic curve (AUROC) for the multi-target transformer was 0·78 (SD 0·01) for BRAF, 0·88 (0·01) for hypermutation, 0·93 (0·01) for MSI, and 0·86 (0·01) for RNF43; predictive performance was consistent across metrics and supported by co-occurrence analyses. However, biomarkers with high AUROCs largely correlated with MSI, with model predictions depending considerably on morphology associated with MSI at pathological examination.
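As a sketch of how a figure such as "mean AUROC (SD)" can be obtained, the following computes the AUROC per model and summarises it across models. It assumes that the SD is taken across repeated model runs (for example, cross-validation folds) evaluated on the same cohort, which the abstract does not state; the function names and scores are hypothetical.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative one.

    Equivalent to the normalised Mann-Whitney U statistic; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarise(labels, scores_per_model):
    """Mean and sample SD of AUROC across models (assumed: one per fold)."""
    aucs = [auroc(labels, s) for s in scores_per_model]
    return mean(aucs), stdev(aucs)


# Hypothetical biomarker labels and scores from three models, not study data.
labels = [1, 1, 0, 0, 1, 0]
scores_per_model = [
    [0.9, 0.7, 0.3, 0.2, 0.6, 0.4],
    [0.8, 0.5, 0.6, 0.1, 0.7, 0.3],
    [0.9, 0.6, 0.2, 0.3, 0.8, 0.5],
]
m, sd = summarise(labels, scores_per_model)
print(f"AUROC {m:.2f} (SD {sd:.2f})")
```

The pairwise formulation is quadratic in cohort size and is chosen for readability; a rank-based implementation would be preferable for cohorts of the size reported here.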
Interpretation
By use of morphology associated with MSI and more subtle biomarker-specific patterns within a shared phenotype, the multi-target transformers efficiently predicted biomarker status for diverse genetic alterations in colorectal cancer from slides stained with haematoxylin and eosin. These results highlight the importance of considering mutational co-occurrence and common morphology in biomarker research based on deep learning. Our validated and scalable model could support extension to other cancers and large, diverse cohorts, potentially facilitating cost-effective pre-screening and streamlined diagnostics in precision oncology.…

