Multimodal pLSA on Visual Features and Tags

This work studies a new approach to image retrieval on large-scale community databases. The proposed system exploits two modalities: visual features and community-generated metadata, such as tags. We use topic models to derive, for each image in the database, a high-level representation suitable for retrieval. We evaluate the approach experimentally in a query-by-example retrieval task and compare the results with systems that rely solely on visual features or solely on tag features. The proposed multimodal system outperforms the unimodal systems by approximately 36%.
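As a rough illustration of how a topic model can yield a retrieval representation, the following minimal sketch fits plain pLSA with EM and ranks images by the similarity of their topic distributions. It is not the method of the report: combining the two modalities by concatenating visual-word and tag counts into one vocabulary is an assumption made here for brevity, as are the cosine similarity measure, the toy counts, and all names (`plsa`, `rank_by_example`, `visual_counts`, `tag_counts`).

```python
import math
import random


def normalize(row):
    """Scale a non-negative list so it sums to 1 (uniform if all zero)."""
    total = sum(row)
    if total > 0:
        return [v / total for v in row]
    return [1.0 / len(row)] * len(row)


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA with EM on a documents-by-words count matrix (list of lists).

    Returns P(z|d) as a list of per-document topic distributions and
    P(w|z) as a list of per-topic word distributions.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_z_d = [normalize([rng.random() for _ in range(n_topics)]) for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() for _ in range(n_words)]) for _ in range(n_topics)]
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior P(z|d,w), proportional to P(z|d) * P(w|z).
                post = normalize([p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)])
                # M-step accumulation: expected counts n(d,w) * P(z|d,w).
                for z in range(n_topics):
                    new_z_d[d][z] += counts[d][w] * post[z]
                    new_w_z[z][w] += counts[d][w] * post[z]
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def rank_by_example(p_z_d, query_idx):
    """Return document indices ordered by topic-vector similarity to the query."""
    query = p_z_d[query_idx]
    return sorted(range(len(p_z_d)), key=lambda d: cosine(p_z_d[d], query), reverse=True)


# Hypothetical toy data: 4 images, 4 visual words, 2 tags.
visual_counts = [[4, 3, 0, 0], [5, 2, 1, 0], [0, 0, 4, 3], [0, 1, 3, 5]]
tag_counts = [[2, 0], [1, 0], [0, 2], [0, 1]]
# Assumed fusion: concatenate both modalities into a single vocabulary.
counts = [v + t for v, t in zip(visual_counts, tag_counts)]

topics, word_dists = plsa(counts, n_topics=2)
ranking = rank_by_example(topics, query_idx=0)
```

Dropping `tag_counts` or `visual_counts` from the concatenation gives the corresponding unimodal variant, which is the kind of baseline the abstract compares against.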

Metadata
Author:	Stefan Romberg, Eva Hörster, Rainer Lienhart
URN:	urn:nbn:de:bvb:384-opus4-11098
Frontdoor URL:	https://opus.bibliothek.uni-augsburg.de/opus4/1316
Series (Serial Number):	Reports / Technische Berichte der Fakultät für Angewandte Informatik der Universität Augsburg (2009-09)
Type:	Report
Language:	English
Date of Publication (online):	2009/10/21
Publishing Institution:	Universität Augsburg
Release Date:	2009/10/21
Tag:	SIFT; image retrieval; multimodal pLSA; tags
Institutes:	Fakultät für Angewandte Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Maschinelles Lernen und Maschinelles Sehen
Dewey Decimal Classification:	0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science
Licence (German):	German copyright law (Deutsches Urheberrecht)

Open Access