The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to analysis of and access to multilingual multimedia content. In its first year, VideoCLEF piloted the Vid2RSS task, which involved the classification of Dutch-language documentaries having embedded English content arising from interviews and discussions with non-Dutch speakers.
Task participants were supplied with Dutch archival metadata, Dutch speech transcripts, English speech transcripts and 10 thematic category labels, which they were required to assign to the test set videos. Participants collected their own training data. Results were delivered as a series of RSS feeds, one for each category. Feed generation, intended to promote visualization, involved simple concatenation of existing feed items (title, description, keyframe). In addition to the mandatory main classification task, VideoCLEF offered two optional tasks. The first was a translation task, requiring translation of the topic-based feeds from Dutch into a target language. The second was a keyframe extraction task, requiring selection of a semantically appropriate keyframe to represent the video from among a set of keyframes (one per shot) supplied with the test data.
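The feed generation step described above can be illustrated with a minimal Python sketch. It is not the track's actual tooling: the function names, the dictionary keys (`title`, `description`, `keyframe_url`) and the use of an `enclosure` element for the keyframe are assumptions made for illustration, and the output is a simplified feed that omits elements (such as a channel `link`) that a complete RSS 2.0 feed would need.

```python
import xml.etree.ElementTree as ET


def build_category_feed(category, videos):
    """Build a simplified RSS feed for one category from existing item fields.

    `videos` is a list of dicts with the hypothetical keys
    'title', 'description' and 'keyframe_url'.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = category
    ET.SubElement(channel, "description").text = f"Videos classified as {category}"
    for video in videos:
        # Each item simply concatenates fields that already exist for the video.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = video["title"]
        ET.SubElement(item, "description").text = video["description"]
        # Attaching the keyframe as an enclosure is an illustrative choice.
        ET.SubElement(item, "enclosure", url=video["keyframe_url"], type="image/jpeg")
    return ET.tostring(rss, encoding="unicode")


def build_feeds(assignments, videos_by_id):
    """Return one feed per category label.

    `assignments` maps a category label to the list of video ids assigned to it.
    """
    return {
        category: build_category_feed(category, [videos_by_id[v] for v in video_ids])
        for category, video_ids in assignments.items()
    }
```

Calling `build_feeds` with a mapping from each category label to its assigned video ids yields one XML string per category, mirroring the "one feed for each category" delivery format.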
Five groups participated in the 2008 VideoCLEF track. The best runs achieved F-scores above 0.50, although no group exceeded 0.60. The most popular strategy was to approach the task as a classification problem, collecting training data from Wikipedia or via a general search engine and using it to train classifiers (SVM, Naive Bayes and k-NN were used). A competitive alternative was to treat the problem as an information retrieval task, with the class label as the query and the test set as the corpus. Both the Dutch speech transcripts and the archival metadata performed well as sources of indexing features, but no group succeeded in combining feature sources to significantly improve performance. The translation task had only one participant, who translated the feeds into English. A small-scale fluency/adequacy evaluation showed the translation to be of sufficient quality to be valuable to an English speaker who does not speak Dutch. The keyframe extraction task also had only one participant, who selected the keyframe from the shot with the most representative speech transcript content. A small user study showed the automatically selected shots to be competitive with manually selected shots. Future years of VideoCLEF will aim to expand the corpus and the class label list as well as to extend the track to additional tasks.
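The information retrieval formulation mentioned above (class label as query, test set as corpus) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reconstruction of any participant's system: the TF-IDF weighting, the function names and the score threshold are all choices made here for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def rank_by_label(label_query, transcripts):
    """Rank videos for one class label, treating the label as a query.

    `transcripts` maps a video id to its transcript (or metadata) text.
    Scoring is a plain TF-IDF sum, chosen here for simplicity.
    """
    docs = {vid: Counter(tokenize(text)) for vid, text in transcripts.items()}
    n_docs = len(docs)
    query_terms = tokenize(label_query)
    scores = {}
    for vid, counts in docs.items():
        length = sum(counts.values()) or 1  # avoid division by zero
        score = 0.0
        for term in query_terms:
            df = sum(1 for c in docs.values() if term in c)
            if df == 0:
                continue  # term occurs nowhere in the corpus
            idf = math.log(1 + n_docs / df)
            score += (counts[term] / length) * idf
        scores[vid] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def assign_label(label_query, transcripts, threshold=0.0):
    """Return the ids of videos whose score exceeds the (assumed) threshold."""
    return [vid for vid, score in rank_by_label(label_query, transcripts) if score > threshold]
```

Running `assign_label` once per category label gives a set of videos for each label. In practice the label would likely be expanded with related terms before querying, since a single-word label rarely matches transcript vocabulary well.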
Example of Dual Language Video
The interest of Dutch television content to non-Dutch speakers is witnessed by examples of dual language content such as the following English/Dutch excerpt of the popular Dutch television documentary "In Europa" that has found its way onto YouTube:
(Note that the lack of Dutch subtitles in this clip is somewhat unusual.)
VideoCLEF at CLEF 2008
The VideoCLEF Session of the CLEF 2008 workshop (17-19 September 2008 in Aarhus, Denmark) included the following presentations:
“VideoCLEF 2008: ASR Classification based on Wikipedia Categories”
Jens Kursten, Chemnitz U.Technology, Germany
“DCU and U.Amsterdam at VideoCLEF”
Martha Larson, U.Amsterdam, The Netherlands