In the beginning was the word: LLM-VaR and LLM-ES

This study introduces LLM-VaR and LLM-ES, novel risk estimation metrics that utilize general-purpose large language models (LLMs) for the forecasting tasks of Value at Risk (VaR) and Expected Shortfall (ES) in a zero-shot setting. Building on the input encoding mechanism of the LLMTime framework, we extend its application by defining new financial risk measures and performing an empirical evaluation of three generations of GPT models, GPT-3.5, GPT-4 and GPT-4o, versus advanced benchmark models such as GARCH with Student innovations and EWMA with Dynamic Conditional Score (DCS). Financial time series are encoded as numerical strings, allowing for model-free inference without requiring retraining. Results show that LLMs perform well when short rolling windows are used, particularly in volatile markets like cryptocurrencies. GPT-3.5 frequently outperforms or matches the performance of newer models, raising questions about model complexity, alignment, and biases. In contrast, performance deteriorates with longer windows, where the econometric models prove more reliable. Our findings demonstrate the potential of general-purpose LLMs as adaptive tools for short-horizon financial risk assessment and contribute a first-of-its-kind benchmark for LLM-based VaR/ES estimation.
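The two ideas the abstract names, encoding a return series as a numerical string and reading VaR and ES off a predictive distribution, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `encode_series` and `empirical_var_es`, the fixed-precision digit format, and the empirical-quantile definitions are assumptions made here for illustration, and the step that would actually query an LLM for sampled continuations is left out.

```python
import math


def encode_series(values, decimals=4):
    """Encode floats as a digit string (assumed LLMTime-like format).

    Each value is scaled to a fixed precision, the decimal point is dropped,
    digits are separated by spaces, and time steps are separated by ' , '.
    """
    tokens = []
    for v in values:
        scaled = round(abs(v) * 10 ** decimals)
        digits = " ".join(str(scaled))
        sign = "-" if v < 0 and scaled != 0 else ""
        tokens.append(sign + digits)
    return " , ".join(tokens)


def empirical_var_es(samples, alpha=0.05):
    """Estimate VaR and ES from sampled one-step-ahead returns.

    VaR is taken as the empirical alpha-quantile of the samples, and ES as
    the mean of the samples at or below that quantile (an assumption; the
    paper's exact definitions are not given in the abstract).
    """
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return ordered[k - 1], sum(tail) / len(tail)


# Hypothetical usage: the prompt string would be sent to a model, and the
# decoded sample continuations would be passed to empirical_var_es.
prompt = encode_series([0.0123, -0.0045, 0.0007])
var_5, es_5 = empirical_var_es([-5.0, -3.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0], alpha=0.2)
```

In this sketch the rolling window is simply the list passed to `encode_series`, so a shorter window means a shorter prompt.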

Metadata
Author:Daniel Traian Pele, Vlad Bolovăneanu, Min-Bin Lin, Rui Ren, Andrei Theodor Ginavar, Bruno Spilak, Alexandru-Victor Andrei, Filip-Mihai Toma, Stefan Lessmann, Wolfgang Karl Härdle
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/123017
ISSN:0957-4174
Parent Title (English):Expert Systems with Applications
Publisher:Elsevier BV
Place of publication:Amsterdam
Type:Article
Language:English
Year of first Publication:2025
Publishing Institution:Universität Augsburg
Release Date:2025/06/25
First Page:128676
DOI:https://doi.org/10.1016/j.eswa.2025.128676
Institutes:Wirtschaftswissenschaftliche Fakultät
Wirtschaftswissenschaftliche Fakultät / Institut für Statistik und mathematische Wirtschaftstheorie
Wirtschaftswissenschaftliche Fakultät / Institut für Statistik und mathematische Wirtschaftstheorie / Lehrstuhl für Statistik
Dewey Decimal Classification:3 Social sciences / 33 Economics / 330 Economics
Collection:Latest publications (not yet published in print)
Licence:CC-BY 4.0: Creative Commons: Attribution (with Print on Demand)