- We present ForDigitStress, a multi-modal stress dataset that uses digital job interviews to induce stress. The dataset provides data from 40 participants, including audio, video (motion capture, facial landmarks, eye tracking), and physiological signals (photoplethysmography, electrodermal activity). It also contains time-continuous annotations for stress and for the emotions that occurred (e.g., shame, anger, anxiety, and surprise). To establish a baseline, five machine learning classifiers (Support Vector Machine, K-Nearest Neighbors, Random Forest, Feed-forward Neural Network, and Long Short-Term Memory Network) were trained and evaluated on the dataset for a binary stress classification task. The best-performing classifier was the Long Short-Term Memory Network, which achieved an accuracy of 91.7% and an F1-score of 90.2%. The ForDigitStress dataset is freely available to other researchers.
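For readers less familiar with the two reported metrics, the following is a minimal sketch of how accuracy and F1-score can be computed for a binary stress label. The function name `binary_metrics` and the toy labels are assumptions made for illustration; they are not taken from the dataset or from the evaluation code used for the baseline.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, f1) for a binary task.

    Hypothetical helper for illustration only; `positive` marks the
    label treated as "stressed".
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Accuracy: share of samples whose predicted label matches the annotation.
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)

    # F1: harmonic mean of precision and recall on the positive class.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Toy labels (made up): 1 = stress, 0 = no stress.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
accuracy, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, f1={f1:.3f}")
```

Reporting F1 alongside accuracy is useful when the two classes are not equally frequent, since accuracy alone can look high for a classifier that mostly predicts the majority class.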