Aggregating Local Features into Bundles for High-Precision Object Retrieval

  • Due to the omnipresence of digital cameras and mobile phones the number of images stored in image databases has grown tremendously in the last years. It becomes apparent that new data management and retrieval techniques are needed to deal with increasingly large image databases. This thesis presents new techniques for content-based image retrieval where the image content itself is used to retrieve images by visual similarity from databases. We focus on the query-by-example scenario, assuming the image itself is provided as query to the retrieval engine. In many image databases, images are often associated with metadata, which may be exploited to improve the retrieval performance. In this work, we present a technique that fuses cues from the visual domain and textual annotations into a single compact representation. This combined multimodal representation performs significantly better compared to the underlying unimodal representations, which we demonstrate on two large-scale imageDue to the omnipresence of digital cameras and mobile phones the number of images stored in image databases has grown tremendously in the last years. It becomes apparent that new data management and retrieval techniques are needed to deal with increasingly large image databases. This thesis presents new techniques for content-based image retrieval where the image content itself is used to retrieve images by visual similarity from databases. We focus on the query-by-example scenario, assuming the image itself is provided as query to the retrieval engine. In many image databases, images are often associated with metadata, which may be exploited to improve the retrieval performance. In this work, we present a technique that fuses cues from the visual domain and textual annotations into a single compact representation. This combined multimodal representation performs significantly better compared to the underlying unimodal representations, which we demonstrate on two large-scale image databases consisting of up to 10 million images. The main focus of this work is on feature bundling for object retrieval and logo recognition. We present two novel feature bundling techniques that aggregate multiple local features into a single visual description. In contrast to many other works, both approaches encode geometric information about the spatial layout of local features into the corresponding visual description itself. Therefore, these descriptions are highly distinctive and suitable for high-precision object retrieval. We demonstrate the use of both bundling techniques for logo recognition. Here, the recognition is performed by the retrieval of visually similar images from a database of reference images, making the recognition systems easily scalable to a large number of classes. The results show that our retrieval-based methods can successfully identify small objects such as logos with an extremely low false positive rate. In particular, our feature bundling techniques are beneficial because false positives are effectively avoided upfront due to the highly distinctive descriptions. We further demonstrate and thoroughly evaluate the use of our bundling technique based on min-Hashing for image and object retrieval. Compared to approaches based on conventional bag-of-words retrieval, it has much higher efficiency: the retrieved result lists are shorter and cleaner while recall is on equal level. The results suggest that this bundling scheme may act as pre-filtering step in a wide range of scenarios and underline the high effectiveness of this approach. Finally, we present a new variant for extremely fast re-ranking of retrieval results, which ranks the retrieved images according to the spatial consistency of their local features to those of the query image. The demonstrated method is robust to outliers, performs better than existing methods and allows to process several hundreds to thousands of images per second on a single thread.show moreshow less

Download full text files

Export metadata

Statistics

Number of document requests

Additional Services

Share in Twitter Search Google Scholar
Metadaten
Author:Stefan RombergGND
URN:urn:nbn:de:bvb:384-opus4-26385
Frontdoor URLhttps://opus.bibliothek.uni-augsburg.de/opus4/2638
Advisor:Rainer Lienhart
Type:Doctoral Thesis
Language:English
Publishing Institution:Universität Augsburg
Granting Institution:Universität Augsburg, Fakultät für Angewandte Informatik
Date of final exam:2013/12/13
Release Date:2014/04/28
Tag:content-based image retrieval; object retrieval; logo retrieval; local feature bundling; Bundle min-Hashing
GND-Keyword:Bildbanksystem; Objekterkennung
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Dewey Decimal Classification:0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik
Licence (German):Deutsches Urheberrecht