Model-X knockoffs in the replication crisis era: reducing false discoveries and researcher bias in social science research

The present study addresses problems faced by data-driven social science caused by having too much or not enough data. In particular, an abundance of data or a (sudden) lack thereof makes it challenging to identify the most important predictors in a sea of noise using the most parsimonious and reproducible model possible. In this article, we present the model-X knockoff method, which was introduced by Candès et al. (2018) for reducing the false identification of significant effects due to flexibility-ambiguity issues, to a broader audience, particularly within the social sciences and humanities. Our goal is to provide an accessible starting point and ideally spark interest among researchers in these fields to explore how model-X knockoffs can enhance their work. The findings from a performance contrast simulation indicate that model-X knockoffs select fewer relevant variables than other statistical methods to automatically identify variables, resulting in fewer mistakes. The simulation findings also demonstrate that model-X knockoffs are stable and less sensitive to even small changes in the dataset than other procedures, making them a viable way to reduce researcher degrees of freedom and increase the reproducibility of scientific findings. An additional real data example demonstrates the operational utility of the simulation.

Metadata
Author: Jing Zhou, Sebastian Scherr
URN: urn:nbn:de:bvb:384-opus4-1198740
Frontdoor URL: https://opus.bibliothek.uni-augsburg.de/opus4/119874
ISSN: 2590-2911
Parent Title (English): Social Sciences & Humanities Open
Publisher: Elsevier BV
Place of publication: Amsterdam
Type: Article
Language: English
Year of first Publication: 2025
Publishing Institution: Universität Augsburg
Release Date: 2025/03/10
Volume: 11
First Page: 101380
DOI: https://doi.org/10.1016/j.ssaho.2025.101380
Institutes: Philosophisch-Sozialwissenschaftliche Fakultät
Philosophisch-Sozialwissenschaftliche Fakultät / imwk - Institut für Medien, Wissen und Kommunikation
Philosophisch-Sozialwissenschaftliche Fakultät / imwk - Institut für Medien, Wissen und Kommunikation / Lehrstuhl für Digital Health Communication
Dewey Decimal Classification: 3 Social sciences / 30 Social sciences, sociology / 300 Social sciences
Licence (German): CC-BY 4.0: Creative Commons: Attribution (with Print on Demand)