IE-Vnet: deep learning-based segmentation of the inner ear's total fluid space
Background In-vivo MR-based high-resolution volumetric quantification methods of the endolymphatic hydrops (ELH) are highly dependent on a reliable segmentation of the inner ear's total fluid space (TFS). This study aimed to develop a novel open-source inner ear TFS segmentation approach using a dedicated deep learning (DL) model.
Methods The model was based on a V-Net architecture (IE-Vnet) and a multivariate (MR scans: T1, T2, FLAIR, SPACE) training dataset (D1, 179 consecutive patients with peripheral vestibulocochlear syndromes). Ground-truth TFS masks were generated in a semi-manual, atlas-assisted approach. IE-Vnet model segmentation performance, generalizability, and robustness to domain shift were evaluated on four heterogeneous test datasets (D2-D5, n = 4 × 20 ears).
Results The IE-Vnet model predicted TFS masks with consistently high congruence to the ground truth in all test datasets (Dice overlap coefficient: 0.9 ± 0.02, Hausdorff maximum surface distance: 0.93 ± 0.71 mm, mean surface distance: 0.022 ± 0.005 mm) without significant differences concerning side (two-sided Wilcoxon signed-rank test, p > 0.05) or dataset (Kruskal-Wallis test, p > 0.05; post-hoc Mann-Whitney U, FDR-corrected, all p > 0.2). Prediction took 0.2 s and was 2,000 times faster than a state-of-the-art atlas-based segmentation method.
Conclusion IE-Vnet TFS segmentation demonstrated high accuracy, robustness toward domain shift, and rapid prediction times. Its output works seamlessly with a previously published open-source pipeline for automatic endolymphatic space (ELS) segmentation. IE-Vnet could serve as a core tool for high-volume trans-institutional studies of the inner ear. Code and pre-trained models are available free and open-source at https://github.com/pydsgz/IEVNet.…
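The three congruence measures reported in the Results (Dice overlap coefficient, Hausdorff maximum surface distance, mean surface distance) can be illustrated with a minimal sketch. This is not the evaluation code used for IE-Vnet. It assumes, purely for illustration, that each mask is a non-empty set of (x, y, z) coordinates in millimetres and that the point sets passed to the distance functions stand in for the mask surfaces. The function names are hypothetical.

```python
from math import dist


def dice(pred, truth):
    """Dice overlap coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not pred and not truth:
        return 1.0  # convention: two empty masks agree perfectly
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def _directed_distances(src, dst):
    """Distance from every point in src to its nearest point in dst."""
    return [min(dist(p, q) for q in dst) for p in src]


def hausdorff(pred, truth):
    """Symmetric Hausdorff distance: the largest nearest-neighbour gap."""
    return max(max(_directed_distances(pred, truth)),
               max(_directed_distances(truth, pred)))


def mean_surface_distance(pred, truth):
    """Mean nearest-neighbour distance, pooled over both directions."""
    d = _directed_distances(pred, truth) + _directed_distances(truth, pred)
    return sum(d) / len(d)


# Toy example: two 3-voxel segments on a 1 mm grid, shifted by one voxel.
truth = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
pred = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}

print(dice(pred, truth))                   # 2 * 2 / (3 + 3) = 0.666...
print(hausdorff(pred, truth))              # 1.0
print(mean_surface_distance(pred, truth))  # (0 + 0 + 1 + 1 + 0 + 0) / 6 = 0.333...
```

The brute-force nearest-neighbour search is quadratic in the number of points, so it suits only small examples. For full-resolution masks, one would typically extract surface voxels first and use a spatial index or distance transform.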