A Text-Based Approach to the ImageCLEF 2010 Photo Annotation Task
Abstract
The challenges of searching the increasingly large collections of digital images appearing in many places mean that automated annotation of images is becoming an important task. We describe our participation in the ImageCLEF 2010 Visual Concept Detection and Annotation Task. Our system used only textual features (Flickr user tags and EXIF information) to perform the automatic annotation, and we explored several techniques to improve the results of this textual annotation. We identify the drawbacks of our approach and how these might be addressed and optimized in further work.
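As a rough illustration of how a text-only annotator of this kind might work, the sketch below scores each concept by string similarity between the concept name and an image's Flickr tags. The `concept_vocab` list, the `difflib`-based matcher, and the 0.8 threshold are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch of tag-based concept annotation; the matcher and
# threshold are assumptions, not the paper's actual method.
from difflib import SequenceMatcher

def tag_similarity(tag: str, concept: str) -> float:
    """Normalized string similarity between a user tag and a concept name."""
    return SequenceMatcher(None, tag.lower(), concept.lower()).ratio()

def annotate(tags: list[str], concept_vocab: list[str], threshold: float = 0.8) -> dict[str, float]:
    """Assign each concept the best similarity score over the image's tags."""
    scores = {}
    for concept in concept_vocab:
        scores[concept] = max((tag_similarity(t, concept) for t in tags), default=0.0)
    # A concept is annotated if some tag matches it closely enough.
    return {c: s for c, s in scores.items() if s >= threshold}

# Example with Flickr-style tags for a holiday photo.
print(annotate(["beach", "sunnyday", "family"], ["Beach", "Sunset", "Family"]))
```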
Related papers
2012
The ImageCLEF 2012 Scalable Image Annotation Using General Web Data Task proposed a challenge in which, instead of relying only on a set of manually annotated images as training data, the objective was to make use of automatically gathered Web data, with the aim of developing more scalable image annotation systems. To this end, the participants were provided with a new dataset, composed of 250,000 images for training, which included various visual feature types and textual features obtained from the websites on which the images appeared. Two subtasks were defined. The first subtask employed the same test set as the ImageCLEF 2012 Flickr Photo Annotation subtask, with the particularity that both the Flickr and Web training sets had to be used. The idea was to determine whether the Web data could help to enhance annotation performance in comparison to using only manually annotated data. The second subtask consisted in using only automatically gathered Web data to develop an image annotation system.
2010
Abstract. In this paper we present three methods for image auto-annotation used by the Wroclaw University of Technology group in the ImageCLEF 2010 Photo Annotation track. All of our experiments focus on the robustness of global color and texture image features in combination with different similarity measures. To annotate an image we use two versions of the PATSI algorithm, which searches the training set for the most similar images and transfers annotations from them to the target image by applying a transfer function.
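A minimal sketch of this annotation-transfer idea, assuming images are represented by global color/texture feature vectors; the Euclidean distance, similarity-weighted voting, k, and threshold are illustrative stand-ins for the PATSI variants the paper actually evaluates.

```python
# Sketch of PATSI-style annotation transfer; the transfer function here
# (similarity-weighted voting with a threshold) is an assumption.
import numpy as np

def patsi_annotate(query_feat, train_feats, train_annotations, k=5, threshold=0.3):
    """Transfer annotations from the k most similar training images."""
    # Euclidean distance as the similarity measure (the paper compares several).
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (1.0 + dists[nearest])          # closer images vote more
    scores = {}
    for w, idx in zip(weights, nearest):
        for label in train_annotations[idx]:
            scores[label] = scores.get(label, 0.0) + w
    total = weights.sum()
    # Transfer a label when its normalized vote mass exceeds the threshold.
    return {lab for lab, s in scores.items() if s / total >= threshold}
```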
This paper presents our submitted experiments in the Concept Annotation and Concept Retrieval tasks using Flickr photos at ImageCLEF 2012. In this edition we applied new strategies in both the textual and the visual subsystems of our multimodal retrieval system. The visual subsystem focuses on extending the low-level feature vector with concept features, which are calculated by means of a logistic regression model. The textual subsystem focuses on expanding the query information using external resources. Our best concept retrieval run, a multimodal one, is at the ninth position with a MnAP of 0.0295, making us the second-best group in the contest for the multimodal modality. This is also our best run in the global ordered list (where eleven textual runs also rank above it). We adapted our multimodal retrieval process to the annotation task, obtaining modest results for this first participation, with a MiAP of 0.1020.
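The concept features mentioned above could, for instance, be produced as per-concept probabilities from independently trained logistic regression models; the sketch below assumes scikit-learn and placeholder training matrices, and is not the group's actual implementation.

```python
# Sketch: extend low-level visual features with per-concept probabilities
# from logistic regression models; variable names are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

def concept_features(X_train, Y_train, X_test):
    """X_train: (n, d) low-level features; Y_train: (n, c) binary concept labels.

    Returns X_test with (m, c) concept probability features appended.
    """
    probs = []
    for c in range(Y_train.shape[1]):
        clf = LogisticRegression(max_iter=1000).fit(X_train, Y_train[:, c])
        probs.append(clf.predict_proba(X_test)[:, 1])  # P(concept present)
    return np.hstack([X_test, np.column_stack(probs)])
```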
International Journal of Engineering Sciences & Research Technology, 2014
Given the vast resource of pictures available on the web, and the fact that many of them naturally co-occur with topically related documents and are captioned, we focus on the task of automatically generating captions for images. The model learns to create captions from a database of news articles, the pictures embedded in them, and their captions, and consists of two stages. Content selection identifies what the image and accompanying article are about, whereas surface realization determines how to verbalize the chosen content. We approximate content selection with a probabilistic image annotation model that suggests keywords for an image. In the proposed system, extensive features are extracted from the database images and stored in a feature library. The extensive feature set comprises shape features along with color, texture, and contourlet features. When a query image is given, its features are extracted in the same fashion. Subsequently, a GA-based similarity measure is computed between the query image features and the database image features.
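The GA-based similarity step could, under one reading, mean a feature-weighted distance whose weights are evolved so that known-relevant database images score highest; the toy GA below (truncation selection, Gaussian mutation) is purely illustrative, not the published method.

```python
# Sketch of a GA-tuned similarity measure; all parameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def weighted_similarity(q, x, w):
    """Similarity as inverse weighted L1 distance over feature components."""
    return 1.0 / (1.0 + np.sum(w * np.abs(q - x)))

def evolve_weights(q, db, relevant, pop=20, gens=50):
    """Evolve feature weights so relevant database images score highest."""
    population = rng.random((pop, len(q)))

    def fitness(w):
        sims = np.array([weighted_similarity(q, x, w) for x in db])
        return sims[relevant].mean() - sims.mean()   # separate relevant items

    for _ in range(gens):
        scores = np.array([fitness(w) for w in population])
        parents = population[np.argsort(scores)[-pop // 2:]]    # truncation selection
        children = parents + rng.normal(0, 0.1, parents.shape)  # Gaussian mutation
        population = np.vstack([parents, np.clip(children, 0, None)])
    return population[np.argmax([fitness(w) for w in population])]
```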
HAL (Le Centre pour la Communication Scientifique Directe), 2012
In this paper, we present the methods we proposed and evaluated in the ImageCLEF 2012 Photo Annotation task. More precisely, we proposed the Histogram of Textual Concepts (HTC) textual feature to capture the relatedness of semantic concepts. In contrast to the term frequency-based text representations mostly used for visual concept detection and annotation, HTC relies on the semantic similarity between the user tags and a concept dictionary. Moreover, a Selective Weighted Late Fusion (SWLF) is introduced to combine multiple sources of information by iteratively selecting and weighting the best features for each concept to be classified. The results show that combining our HTC feature with visual features through SWLF can improve performance significantly. Our best model, a late fusion of textual and visual features, achieved a MiAP (Mean interpolated Average Precision) of 43.67% and ranked first out of the 80 submitted runs.
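A minimal sketch of the HTC idea: each histogram bin accumulates the semantic similarity between an image's tags and one dictionary concept. The WordNet path similarity used here is an assumption; the paper's actual relatedness measure may differ.

```python
# Sketch of a Histogram of Textual Concepts; requires nltk.download("wordnet").
from nltk.corpus import wordnet as wn

def similarity(word_a: str, word_b: str) -> float:
    """Best WordNet path similarity over the two words' synsets (0 if none)."""
    best = 0.0
    for sa in wn.synsets(word_a):
        for sb in wn.synsets(word_b):
            sim = sa.path_similarity(sb)
            if sim is not None and sim > best:
                best = sim
    return best

def htc(tags: list[str], dictionary: list[str]) -> list[float]:
    """One histogram bin per dictionary concept, accumulated over all tags."""
    return [sum(similarity(tag, concept) for tag in tags) for concept in dictionary]
```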
Lecture Notes in Computer Science, 2007
This paper describes the general photographic retrieval and object annotation tasks of the ImageCLEF 2006 evaluation campaign. These tasks provide both the resources and the framework necessary to perform comparative laboratory-style evaluation of visual information systems for image retrieval and automatic image annotation. Both tasks offered something new for 2006 and attracted a large number of submissions: 12 groups participated in ImageCLEFphoto and 3 in the automatic annotation task. This paper summarises the components used in the benchmark, including the collections, the search and annotation tasks, the submissions from participating groups, and the results. The general photographic retrieval task, ImageCLEFphoto, used a new collection, the IAPR TC-12 Benchmark, of 20,000 colour photographs with semi-structured captions in English and German. This new collection replaces the St Andrews collection of historic photographs used for the previous three years. For ImageCLEFphoto, groups submitted mainly text-only runs; however, 31% of runs involved some kind of visual retrieval technique, typically combined with text through the merging of image and text retrieval results. Bilingual text retrieval was performed using two target languages, English and German, with 59% of runs bilingual. The highest bilingual performance was 74% of monolingual English retrieval for Portuguese-English, and 39% of monolingual German for English-German. Combined text and image retrieval approaches were seen to give, on average, higher retrieval results (+54%) than using text (or image) retrieval alone. Similar to previous years, the use of relevance feedback (most commonly in the form of pseudo relevance feedback) to enable query expansion was seen to improve the text-based submissions by an average of 39%. Topics have been categorised and analysed with respect to various attributes, including an estimation of their "visualness" and linguistic complexity.
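For concreteness, pseudo relevance feedback of the kind reported above can be sketched as expanding a query with frequent terms from the top-ranked documents; the parameters and term-selection heuristic below are illustrative, not those of any participating group.

```python
# Sketch of pseudo relevance feedback (PRF) query expansion; the top-k/top-n
# parameters and TF-based term selection are assumptions.
from collections import Counter

def prf_expand(query_terms, ranked_docs, k_docs=5, n_terms=5, stopwords=frozenset()):
    """Expand the query with frequent terms from the top-ranked documents.

    ranked_docs: documents as token lists, best-ranked first.
    """
    counts = Counter()
    for doc in ranked_docs[:k_docs]:
        counts.update(t for t in doc if t not in stopwords and t not in query_terms)
    expansion = [t for t, _ in counts.most_common(n_terms)]
    return list(query_terms) + expansion
```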
The ImageCLEF 2013 Scalable Concept Image Annotation Subtask was the second edition of a challenge aimed at developing more scalable image annotation systems. Unlike traditional image annotation challenges, which rely on a set of manually annotated images as training data for each concept, the participants were only allowed to use automatically gathered web data. The challenge focused not only on the image annotation algorithms developed by the participants (given an input image and a set of concepts, decide which of them are present in the image and which are not), but also on the scalability of their systems, since the concepts to detect were not exactly the same between the development and test sets. The participants were provided with web data consisting of 250,000 images, which included textual features obtained from the web pages on which the images appeared, as well as various visual features extracted from the images themselves. To evaluate the performance of the submitted systems, a development set containing 1,000 images manually annotated for 95 concepts and a test set containing 2,000 images annotated for 116 concepts were provided. In total, 13 teams participated, submitting 58 runs, most of which significantly outperformed the baseline system on both the development and test sets, including on the test concepts not present in the development set, clearly demonstrating potential for scalability.
The UNED-UV group participated in the Scalable Concept Image Annotation subtask of the ImageCLEF 2013 campaign. We present a multimedia IR-based system for the annotation task. In this collection the images do not have any associated textual description, so we downloaded and preprocessed the web pages that contain the images. Regarding the concepts, we expanded their textual descriptions with additional information from external resources such as Wikipedia or WordNet, and we generated a KLD concept model using the recovered textual information. The multimedia IR-based system uses a logistic relevance algorithm to train a model for each of the concepts using visual image features. Finally, the fusion subsystem merges the textual and visual scores for a given image belonging to a concept and decides on the presence of the concept in the image.
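The final fusion step could be as simple as a convex combination of the textual and visual scores followed by a decision threshold; the weighting below is an assumption for illustration, not the group's tuned fusion rule.

```python
# Sketch of late fusion of textual and visual concept scores; alpha and the
# threshold are assumed free parameters.
def fuse(text_score: float, visual_score: float,
         alpha: float = 0.5, threshold: float = 0.5) -> bool:
    """Decide concept presence from a convex combination of the two scores."""
    combined = alpha * text_score + (1.0 - alpha) * visual_score
    return combined >= threshold

# Example: a strong textual match can outweigh a weak visual score.
print(fuse(text_score=0.9, visual_score=0.2, alpha=0.7))  # True
```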
2011
In this paper we present details of the joint submission of TU Berlin and Fraunhofer FIRST to the ImageCLEF 2011 Photo Annotation Task. We experimented with extensions of Bag-of-Words (BoW) models at several levels and applied several kernel-based learning methods recently developed in our group. For classifier training we used non-sparse multiple kernel learning (MKL) and an efficient multi-task learning (MTL) heuristic based on MKL over kernels from classifier outputs. For multi-modal fusion we used a smoothing method on tag-based features inspired by Bag-of-Words soft mappings and Markov random walks. We submitted one multi-modal run extended by the user tags and four purely visual runs based on Bag-of-Words models. Our best visual result, which used the MTL method, was ranked first according to mean average precision (MAP) among the purely visual submissions. Our multi-modal submission achieved the first rank by MAP among the multi-modal submissions and the best MAP among all submissions. Submissions by other groups such as BPACAD, CAEN, UvA-ISIS, and LIRIS were ranked closely.
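The tag-feature smoothing mentioned above can be sketched as one or more steps of a random walk on a tag-similarity graph, spreading mass from observed tags to related tags; the similarity matrix S and step count below are assumed inputs, not the submission's actual construction.

```python
# Sketch of random-walk smoothing of tag-occurrence features.
import numpy as np

def smooth_tags(x, S, steps=1):
    """x: binary tag-occurrence vector (numpy array);
    S: nonnegative tag-similarity matrix, one row/column per tag."""
    P = S / S.sum(axis=1, keepdims=True)   # row-normalize into transition probabilities
    smoothed = x.astype(float)
    for _ in range(steps):
        smoothed = smoothed @ P            # one random-walk step
    return smoothed
```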
Working Notes of CLEF, 2011
In this paper, we focus on the ImageCLEF task in which the LIRIS-Imagine research group participated: visual concept detection and annotation. For this task, we first propose two kinds of textual features to extract semantic meaning from the text associated with images: one is based on a semantic distance matrix between the text and a semantic dictionary, and the other carries valence and arousal meanings by making use of the Affective Norms for English Words (ANEW) dataset. Meanwhile, we investigate the efficiency of different visual features, including color, texture, shape, and high-level features, and we test four fusion methods (min, max, mean, and score) to combine the various features and improve performance. The results show that the combination of our textual features and visual features can improve performance significantly.
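The four fusion rules named above admit a compact sketch; the weight vector in the "score" rule is treated here as a free parameter, which is an assumption.

```python
# Sketch of the min/max/mean/score fusion rules for per-feature classifier
# scores; the weighted "score" rule's weights are an assumed parameter.
import numpy as np

def fuse_scores(scores, rule="mean", weights=None):
    """Combine per-feature classifier scores for one image/concept pair."""
    s = np.asarray(scores, dtype=float)
    if rule == "min":
        return float(s.min())
    if rule == "max":
        return float(s.max())
    if rule == "mean":
        return float(s.mean())
    if rule == "score":                     # weighted combination
        w = np.asarray(weights) if weights is not None else np.ones_like(s)
        return float(np.dot(w, s) / w.sum())
    raise ValueError(f"unknown rule: {rule}")
```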
