Coherent multi-sentence video description with variable level of detail
- Humans can easily describe what they see in a coherent way and at varying level of detail. However, existing approaches for automatic video description focus on generating only single sentences and are not able to vary the descriptions’ level of detail. In this paper, we address both of these limitations: for a variable level of detail we produce coherent multi-sentence descriptions of complex videos. To understand the difference between detailed and short descriptions, we collect and analyze a video description corpus of three levels of detail. We follow a two-step approach where we first learn to predict a semantic representation (SR) from video and then generate natural language descriptions from it. For our multi-sentence descriptions we model across-sentence consistency at the level of the SR by enforcing a consistent topic. Human judges rate our descriptions as more readable, correct, and relevant than related work.
Author: | Anna Rohrbach, Marcus Rohrbach, Wei Qiu, Annemarie Friedrich, Manfred Pinkal, Bernt Schiele |
---|---|
Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/105700 |
ISBN: | 978-3-319-11751-5 |
ISBN: | 978-3-319-11752-2 |
ISSN: | 0302-9743 |
ISSN: | 1611-3349 |
Parent Title (English): | Pattern Recognition: 36th German Conference, GCPR 2014, Münster, Germany, September 2-5, 2014, Proceedings |
Publisher: | Springer |
Place of publication: | Cham |
Editor: | Xiaoyi Jiang, Joachim Hornegger, Reinhard Koch |
Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2014 |
Release Date: | 2023/07/10 |
First Page: | 184 |
Last Page: | 195 |
Series: | Lecture Notes in Computer Science ; 8753 |
DOI: | https://doi.org/10.1007/978-3-319-11752-2_15 |
Institutes: | Fakultät für Angewandte Informatik; Fakultät für Angewandte Informatik / Institut für Informatik; Fakultät für Angewandte Informatik / Institut für Informatik / Professur für Sprachverstehen mit der Anwendung Digital Humanities |