--------------------------------------------------------
Notes on ImageCLEF topics for the ad hoc retrieval task
--------------------------------------------------------
Paul Clough, April 2003.
Introduction
-------------
This file contains a description of the format used to encode topics
for the ImageCLEF 2003 ad hoc retrieval task. Each of the 50 topics consists
of an English version containing a title and narrative, and translations
of the title into German, French, Italian, Spanish and Dutch, together with
possible linguistic variations. Each title was translated literally by a
native speaker of the target language.
In some cases, more than one translator was available for a language, and
their translations were merged into a single set.
Multiple translations arise either from differences between translators or
from a topic title having more than one possible literal translation. Where
a topic has multiple translations in the same language, the first is the most
suitable translation as judged by the first translator; the rest are in no
particular order.
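The ordering convention above can be sketched in a few lines of Python. This
is only an illustration: the dictionary layout, the language codes, the
function names and the placeholder strings are assumptions made for this
sketch and are not part of the topic files themselves.

```python
from typing import Dict, List, Optional


def preferred_title(variants: Dict[str, List[str]], lang: str) -> Optional[str]:
    """Return the first-listed (most suitable) title variant, or None."""
    titles = variants.get(lang, [])
    return titles[0] if titles else None


def alternative_titles(variants: Dict[str, List[str]], lang: str) -> List[str]:
    """Return the remaining variants; their order carries no meaning."""
    return variants.get(lang, [])[1:]


# Hypothetical layout: language code -> title variants in file order.
# The strings are placeholders, not real topic titles.
example = {
    "de": ["first variant", "second variant", "third variant"],
    "nl": ["only variant"],
}

print(preferred_title(example, "de"))
print(alternative_titles(example, "de"))
```

A system that wants a single query per language would use only the first
variant; one that exploits the linguistic variations could treat the
remaining variants as an unordered set.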
The English topic consists of a title (a short query of typically 2-3 words)
and a narrative. The narrative is a longer description of what constitutes
a relevant image, and is more specific about relevance than the title.
Translations of the topic include only the title because of the limited time
and effort available from our translators.
Topics were carefully chosen to represent "typical" searches that one might
expect against the target collection: the Eurovision St Andrews photographic
collection (or ESTA).
Topic subject matter was derived from:
(1) St Andrews University library Web logs for this collection.
(2) Subject categories as used in the St Andrews photographic collection.
(3) An initial study done by an MSc student at Sheffield University.
The topics are as far as possible representative of:
(1) Real queries as found in Web logs.
(2) The length of typical queries.
(3) Queries which will cause problems in translation. For example:
a. Proper names.
b. Verb and noun phrases.
c. Source and target word ambiguity.
d. Compound nouns and verbs.
e. Word inflections, e.g. plurals and gender.
(4) Types of requests: pictures of specific objects, pictures of more
general concepts, or pictures showing an action.
Topic format
-------------
We have tried to be consistent with the existing CLEF topic encoding scheme so
that existing CLEF topic parsers can be used. However, we have had to adapt the
format slightly to encode additional information.
The format of an example English topic is as follows: