Statistical inference after variable selection in Cox models: a neutral simulation study

Choosing relevant predictors is central to the analysis of biomedical time-to-event data. Classical frequentist inference, however, presumes that the set of covariates is fixed in advance and does not account for data-driven variable selection. As a consequence, naive post-selection inference may be biased and misleading. In right-censored survival settings, these issues may be further exacerbated by the additional uncertainty induced by censoring. We investigate several inference procedures applied after variable selection for the coefficients of the Lasso and its extension, the adaptive Lasso, in the context of the Cox model. The methods considered include sample splitting, post-selection inference procedures that condition explicitly on the Lasso selection event, and the debiased Lasso. Because these methods address different inferential targets, we distinguish selected-submodel targets from full-model targets and interpret empirical coverage, interval width, power, and type I error accordingly. Their performance is examined in a neutral simulation study reflecting realistic covariate structures and censoring rates commonly encountered in biomedical applications. The primary focus is post-selection inference after Cox-Lasso variable selection, not a comprehensive benchmark of very-high-dimensional variable-selection performance. To complement the simulation results, we illustrate the practical behavior of these procedures in an applied example using a publicly available survival dataset.

Metadata
Author:Lena Schemet, Sarah Friedrich-Welz
URN:urn:nbn:de:bvb:384-opus4-1314195
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/131419
ISSN:1471-2288
Parent Title (English):BMC Medical Research Methodology
Publisher:BioMed Central
Place of publication:London
Type:Article
Language:English
Date of first Publication:2026/06/23
Publishing Institution:Universität Augsburg
Release Date:2026/06/25
Tag:Cox model; Debiased Lasso; Lasso; Post-selection inference; Survival analysis
Volume:26
Issue:1
First Page:143
DOI:https://doi.org/10.1186/s12874-026-02887-0
Institutes:Mathematisch-Naturwissenschaftlich-Technische Fakultät
Fakultätsübergreifende Institute und Einrichtungen
Mathematisch-Naturwissenschaftlich-Technische Fakultät / Institut für Mathematik
Mathematisch-Naturwissenschaftlich-Technische Fakultät / Institut für Mathematik / Lehrstuhl für Mathematical Statistics and Artificial Intelligence in Medicine
Fakultätsübergreifende Institute und Einrichtungen / Zentrum für Advanced Analytics and Predictive Sciences (CAAPS)
Dewey Decimal Classification:5 Natural sciences and mathematics / 51 Mathematics / 510 Mathematics
Licence (German):CC-BY 4.0: Creative Commons: Namensnennung