- Many content-based image mining systems extract local features from images to obtain an image description based on discrete feature occurrences. Such applications require a visual vocabulary, also known as a visual codebook or visual dictionary, to discretize the extracted high-dimensional features into visual words in an efficient yet accurate way. When such an application operates on images of a very specific domain, the question arises whether a vocabulary built from those domain-specific images needs to be used, or whether a "universal" visual vocabulary can be used instead. A universal visual vocabulary may be computed once from images of a different domain and then be re-used for various applications and other domains. We therefore evaluate several visual vocabularies from different image domains by determining their performance at pLSA-based image classification on several datasets. We empirically conclude that the vocabularies suit our classification tasks equally well regardless of the image domain they were derived from.
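The discretization step described above — mapping each high-dimensional local descriptor to the nearest entry of a visual vocabulary and counting occurrences — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the descriptor dimensionality (128, as for SIFT), the vocabulary size, and the random data are all assumptions, and a plain Lloyd's-style k-means stands in for whatever clustering the evaluated systems use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 500 local descriptors (e.g. 128-D SIFT)
# pooled from a set of training images.
train_desc = rng.normal(size=(500, 128))

# Build a small visual vocabulary: k cluster centres found by a few
# rounds of plain k-means (Lloyd's algorithm) over the descriptors.
k = 10
centroids = train_desc[rng.choice(len(train_desc), size=k, replace=False)].copy()
for _ in range(10):
    # Assign each descriptor to its nearest centre (Euclidean distance).
    dists = np.linalg.norm(train_desc[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Recompute each centre as the mean of its assigned descriptors.
    for j in range(k):
        if np.any(labels == j):
            centroids[j] = train_desc[labels == j].mean(axis=0)

# Quantize the descriptors of one new image to visual words and count
# them, yielding the discrete bag-of-visual-words representation that
# occurrence-based models such as pLSA operate on.
img_desc = rng.normal(size=(80, 128))
dists = np.linalg.norm(img_desc[:, None, :] - centroids[None, :, :], axis=2)
words = dists.argmin(axis=1)
histogram = np.bincount(words, minlength=k)
# One visual word per local descriptor, so the counts sum to 80.
print(histogram.sum())
```

The "universal vocabulary" question the abstract raises corresponds to fitting `centroids` on descriptors from one image domain and reusing them to quantize images from another.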

