Automatic image tagging for corpus linguistics: a multimodal study of news representations of Islam
- This Element reports on the creation and analysis of a 1.5-million-word corpus consisting of a year's worth of UK national press news articles about Islam and Muslims, published between December 2022 and November 2023. The corpus also contains 8,546 image files which have been automatically tagged using Google's Vertex AI. Analysis was carried out on three levels: (a) written text only, (b) images only, and (c) interactions between written text and images. Using examples from the analyses, the authors demonstrate the affordances of these three approaches, providing a critical evaluation of Vertex AI's capabilities and the abilities of popular corpus software to work with visually tagged corpora. The Element acts as a practical guide for researchers who want to carry out this form of analysis. This title is also available as Open Access on Cambridge Core.
| Author: | Paul Baker, Hanna Schmück, Yufang Qian |
|---|---|
| URN: | urn:nbn:de:bvb:384-opus4-1256037 |
| Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/125603 |
| ISBN: | 9781009581233 |
| ISBN: | 9781009581240 |
| ISBN: | 9781009581257 |
| Publisher: | Cambridge University Press |
| Place of publication: | Cambridge |
| Type: | Book |
| Language: | English |
| Year of first Publication: | 2025 |
| Publishing Institution: | Universität Augsburg |
| Release Date: | 2025/10/02 |
| Page Number: | 82 |
| Series: | Elements in Corpus Linguistics |
| DOI: | https://doi.org/10.1017/9781009581233 |
| Institutes: | Fakultät für Angewandte Informatik |
| | Fakultät für Angewandte Informatik / Institut für Informatik |
| | Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Computerlinguistik |
| Licence (German): | CC-BY-NC 4.0: Creative Commons: Attribution - NonCommercial |



