University of Sheffield
Welcome to a new CLEF track for 2003 called ImageCLEF. This track concerns cross language retrieval of images via their associated textual captions. As a pilot experiment in CLEF, the goal of this track is to explore and study the relationship between images and their captions during the retrieval process. It is likely that this track will appeal to members of more than one research community, including those from image retrieval, cross language retrieval and user interaction. Given queries in languages other than English, the goal is to use whatever method is appropriate to retrieve relevant images from a photographic collection built especially for this purpose (the Eurovision St Andrews photographic collection, or ESTA).
We propose the following two tasks, described below. Participants of ImageCLEF can attempt either one of them, or both.
As a pilot experiment, we expect that there will be unforeseen problems with the collection, the topics and the evaluation method, and we are already aware of limitations with the current resources, such as limited topic translations. We would therefore warmly welcome any recommendations, suggestions for improvements, or help from participants that would make this task of greater benefit to the information retrieval community. By running this track, we hope to stimulate ideas and interaction between ImageCLEF participants (via the ImageCLEF mailing list: email@example.com) in order to further research in cross language image retrieval.
Task 1: automatic ad hoc retrieval
The task is similar to the classic TREC ad hoc retrieval task, in that we simulate the situation in which a system knows the set of documents to be searched, but cannot anticipate the particular topic that will be investigated (i.e. topics are not known to the system in advance). For this task, we provide a list of topic statements and a collection of images with semi-structured captions in English (target language).
The English version of each topic consists of a title (a short sentence or phrase describing the search request in a few words) and a narrative (a description of what constitutes a relevant or non-relevant image for that search request). The narrative also contains an example image and caption, which we envisage could be used for relevance feedback and query-by-example searches.
The titles of each topic have been translated by native speakers into five European languages (the source languages): Spanish, Italian, German, French and Dutch, and variations on the titles are included as part of the topic statement. Due to limited translation resources, we have been unable to translate the rest of the topic statement (i.e. the narrative) into languages other than English. The topics are available here and more information about their format is available here. We expect that participants will focus on translating the titles and that the narrative will remain largely unused (the descriptions of relevance are needed mainly during the relevance assessments). However, participants may use the example image and caption given in the English narrative during retrieval.
The goal of the automatic task is to retrieve as many relevant images from the collection as possible given the multilingual topic titles. Participants are free to use whatever methods they want to retrieve relevant documents, including content-based retrieval methods and query expansion. Participants are also free to experiment with whatever methods they wish for CLIR. The task is to be fully automatic, without any user interaction.
The document collection consists of around 30,000 images and captions. The captions consist of several fields containing semi-structured data, and participants are free to match on any part of the caption for textual retrieval. We encourage participants who have the resources to translate the English narratives or the image captions into languages other than English to do so and to share these translations with other members of ImageCLEF.
For this task, participants are required to submit ranked lists of the top 1000 images, with the most similar images nearest the top of the list. Participants can submit (via email) as many system runs as they require, but should indicate their best run, as we can only guarantee to evaluate that one. Ranked lists from all participants will be pooled to create relevance sets, and assessors from the University of Sheffield will make the relevance judgements.
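For illustration, the minimal Python sketch below writes a ranked list for one topic in the standard TREC run format (topic number, the literal Q0, document identifier, rank, score, run tag). The image identifiers and run tag shown are hypothetical, and the definitive submission format for ImageCLEF is the one given in the submission guidelines referred to below.

# Minimal sketch: write a ranked list for one topic in TREC run format.
# Image identifiers and the run tag are hypothetical; the definitive format
# is defined in the ImageCLEF submission guidelines.
ranked_list = [
    ("stand_03_1001", 12.7),   # (image/caption identifier, similarity score)
    ("stand_03_0042", 11.9),
    ("stand_03_0877", 10.3),
]

with open("sheffield_run1.txt", "w") as out:
    for rank, (image_id, score) in enumerate(ranked_list, start=1):
        # topic 1, most similar images first, at most 1000 entries per topic
        out.write(f"1 Q0 {image_id} {rank} {score:.4f} sheffield_run1\n")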
Ranked lists from participants for this ad hoc task will be evaluated using trec_eval, reporting recall and precision at various cut-off levels together with single-value summaries derived from precision and recall, i.e. mean average precision and R-precision. We will publish results in a manner similar to the way in which NIST publishes the results from TREC.
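As a rough illustration of these summary measures, the sketch below computes uninterpolated average precision and R-precision for a single topic from a ranked list and a set of relevant image identifiers (both hypothetical); mean average precision is simply the mean of average precision over all topics, and trec_eval reports these figures directly from the submitted runs and the relevance judgements.

# Illustrative computation of average precision and R-precision for one topic.
# 'ranked' is a system's ranked list of image ids; 'relevant' is the set of
# ids judged relevant for the topic (both hypothetical here).
def average_precision(ranked, relevant):
    hits, precision_sum = 0, 0.0
    for rank, image_id in enumerate(ranked, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank   # precision at each relevant rank
    return precision_sum / len(relevant) if relevant else 0.0

def r_precision(ranked, relevant):
    r = len(relevant)
    # precision after exactly R images have been retrieved
    return sum(1 for image_id in ranked[:r] if image_id in relevant) / r if r else 0.0

ranked = ["img_0042", "img_0877", "img_1001", "img_0005"]
relevant = {"img_0042", "img_1001"}
print(average_precision(ranked, relevant))   # (1/1 + 2/3) / 2 = 0.833...
print(r_precision(ranked, relevant))         # 1/2 = 0.5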
Task 2: interactive image retrieval
The goal of the interactive task is not to compare participants' systems in a competitive environment, but rather for participants to explore variations of their retrieval system in two scenarios. The scenarios are detailed here, and participants are free to complete one or both tasks. The tasks can be used to compare two systems (or any other two variables in the system, such as the type of translation method used) or to evaluate a single system. We recommend that at least 4 users be involved in testing the system. In both scenarios, user questionnaires are a recommended way of obtaining feedback from users about their level of satisfaction with the system. An example questionnaire can be obtained, if required, by contacting Paul Clough.
In both scenarios, native speakers of languages other than English should be able to interact with your image retrieval system in a language of their choice (we suggest limiting it to one of the 5 European languages used in task 1: French, German, Italian, Spanish or Dutch). It is likely that users will want to browse through images in the collection, and participants are encouraged to explore different interfaces, or ways of organising the collection, to support these tasks. For example, you may want to experiment with approaches for browsing versus searching, clustering images, caption similarity searches, relevance feedback and perhaps even different user input mechanisms, e.g. sketching the required image.
We suggest the scenarios can be used to address at least three aspects of cross language image retrieval which may affect overall retrieval performance (but are not necessarily due to good or bad effectiveness of the retrieval system itself):
We encourage participants to look at the iCLEF track guidelines for further advice on how to perform interactive retrieval experiments. No formal evaluation of this task will take place (except for the relevance assessments for the second scenario); rather, participants are encouraged to discuss with others what they have learned from these scenarios.
The image collection: ESTA
The image collection for this task consists of 28,133 images crawled from the St Andrews University Library photographic collection and arranged onto 2 CDs, which can be obtained from the contacts below. Before receiving the data, you must fill in a CLEF agreement form, available from Carol Peters. Once we are notified of your agreement, we will send you the CDs and include you on the ImageCLEF mailing list.
The submission guidelines can be found here. Submission is only required for task 1; the interactive task is not formally evaluated by us. Any papers you write, together with our evaluation of your retrieval results, will be published in the CLEF proceedings. The deadline for submitting papers is 20th July.
Contacts for this track
Paul Clough, University of Sheffield (firstname.lastname@example.org)
Mark Sanderson, University of Sheffield (email@example.com)
We have set up a mailing list for participants: firstname.lastname@example.org. Please email the above contacts to be added to the list.
Results from ImageCLEF 2003
Four groups participated in ImageCLEF 2003: the University of Surrey, Daedalus (Spain), National Taiwan University (NTU) and ourselves. A summary of the results can be found in our ImageCLEF overview paper (Clough and Sanderson, 2003). For more information about participants' entries, see the CLEF web site.
The relevance judgements (qrels) are now available to download for the ImageCLEF 2003 ad hoc task.
Clough, P.D. and Sanderson, M. (2003), The CLEF 2003 Cross Language Image Retrieval Track, In Submission, Cross Language Evaluation Forum (CLEF) 2003, Trondheim, Norway.