Automatic image tagging for corpus linguistics: a multimodal study of news representations of Islam

This Element reports on the creation and analysis of a 1.5-million-word corpus consisting of a year's worth of UK national press news articles about Islam and Muslims, published between December 2022 and November 2023. The corpus also contains 8,546 image files which have been automatically tagged using Google's Vertex AI. Analysis was carried out on three levels: (a) written text only, (b) images only, and (c) interactions between written text and images. Using examples from the analyses, the authors demonstrate the affordances of these three approaches, providing a critical evaluation of Vertex AI's capabilities and the abilities of popular corpus software to work with visually tagged corpora. The Element acts as a practical guide for researchers who want to carry out this form of analysis. This title is also available as Open Access on Cambridge Core.

Metadata
Author:	Paul Baker, Hanna Schmück, Yufang Qian
URN:	urn:nbn:de:bvb:384-opus4-1256037
Frontdoor URL:	https://opus.bibliothek.uni-augsburg.de/opus4/125603
ISBN:	9781009581233
ISBN:	9781009581240
ISBN:	9781009581257
Publisher:	Cambridge University Press
Place of publication:	Cambridge
Type:	Book
Language:	English
Year of first Publication:	2025
Publishing Institution:	Universität Augsburg
Release Date:	2025/10/02
Page Number:	82
Series:	Elements in Corpus Linguistics
DOI:	https://doi.org/10.1017/9781009581233
Institutes:	Fakultät für Angewandte Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Computerlinguistik
Licence:	CC-BY-NC 4.0: Creative Commons: Attribution - NonCommercial

Open Access