Video Data Set for 2012 Tagging Task
We are currently working on releasing the Creative Commons data set used for the 2012 Tagging Task. The set will be publicly available, with the exception of the ASR transcripts, which will need to be licensed separately. Please contact Martha Larson m.a.larson (at) tudelft (dot) nl for information.
The 2012 Tagging Task
The task requires participants to automatically assign tags to Internet videos using features derived from speech, audio, visual content, or associated textual or social information. This year we will again focus on tags related to the genre of the video. Genre is understood in terms of the common browsing categories used on Internet video sharing websites, in particular blip.tv.
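To make the task setup concrete, below is a minimal sketch of one possible text-only baseline that predicts a genre tag for an episode from its title and description using TF-IDF features and a linear classifier. The use of scikit-learn, the example texts, and the variable names are illustrative assumptions, not part of the task specification.

# Illustrative text-only baseline (an assumption, not the official task setup):
# predict one genre tag per episode from its metadata using TF-IDF + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training pairs of (title + description text, genre tag); real systems would
# read these from the released metadata files.
train_texts = ["Weekly show about open source software and gadgets",
               "Live acoustic session with an indie band"]
train_tags = ["Technology", "Music and Entertainment"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_tags)

# Predicted tags and per-tag scores (the scores can be used to rank videos per tag).
test_texts = ["Interview about the upcoming parliamentary elections"]
print(model.predict(test_texts))
print(dict(zip(model.classes_, model.predict_proba(test_texts)[0])))

A competitive system would presumably combine such metadata features with evidence from the speech recognition transcripts, keyframes, and social information provided with the data.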
Target group
Researchers in the area of multimedia retrieval, spoken content search and social media.
Data
The data set extends the blip.tv data used by the 2010 Wild Wild Web Task and the 2011 Genre Tagging Task. The set is predominantly English, with the non-English content spread across a range of languages including French, Spanish, and Dutch. All videos are shared by their owners under a Creative Commons license. The data set is expected to contain over 10,000 videos, amounting to over 2,000 hours of material.
Participants are provided with a video file for each episode, along with metadata (e.g., title and description), comments, shot boundary information, one keyframe per shot, and speech recognition transcripts, where applicable.
The tags for this year's task will be: Technology, Music and Entertainment, Politics, Educational, Videoblogging, Religion, Movies and Television, Sports, Art, Comedy, Gaming, Citizen Journalism, Documentary, The Mainstream Media, Business, Food & Drink, Health, Conferences and Other Events, Literature, The Environment, Personal or Auto-biographical, School and Education, Travel, Web Development and Sites, Science, Autos and Vehicles, Friends, Default Category
Ground truth and evaluation
The ground truth will be genre-related tags assigned by users to their videos. We will carry out a manual process for normalizing and de-noising the tags. The official evaluation metric will be Mean Average Precision (MAP). Average precision should also be reported individually for each tag.
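For clarity, the sketch below shows how per-tag average precision and MAP could be computed from a ranked list of video identifiers, treating each tag as a query. The input format and variable names are illustrative assumptions, not the official evaluation tooling.

def average_precision(ranked_video_ids, relevant_ids):
    # AP for a single tag: ranked_video_ids is the system's ranking of videos for
    # that tag, relevant_ids is the set of videos carrying the tag in the ground truth.
    hits, precision_sum = 0, 0.0
    for rank, video_id in enumerate(ranked_video_ids, start=1):
        if video_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / max(len(relevant_ids), 1)

def mean_average_precision(rankings_by_tag, relevant_by_tag):
    # MAP: the mean of the per-tag AP values over all tags in the ground truth.
    aps = [average_precision(rankings_by_tag[tag], relevant_by_tag[tag])
           for tag in relevant_by_tag]
    return sum(aps) / len(aps)

# Toy example with made-up video identifiers:
print(average_precision(["v3", "v1", "v7"], {"v1", "v7"}))  # 0.583...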
Task schedule
May 29th: Development data release.
June 25th: Test data release.
August 13th: Run submission deadline.
August 27th: Release of results.
September 17th: Working notes papers due.
Recommended reading
Brezeale, D. and Cook, D.J. 2008. Automatic Video Classification: A Survey of the Literature. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 38, no. 3, pp. 416-430, May 2008.
You, J., Liu, G., and Perkis, A. 2010. A semantic framework for video genre classification and event analysis. Signal Processing: Image Communication 25, 4 (April 2010), 287-302.
Larson, M., Eskevich, M., Ordelman, R., Kofler, C., Schmiedeke, S., Jones, G. J. F. 2011. Overview of MediaEval 2011 rich speech retrieval task and genre tagging task. Working Notes Proceedings of the MediaEval 2011 Workshop.
Larson, M., Soleymani, M., Serdyukov, P., Rudinac, S., Wartena, C., Murdock, V., Friedland, G., Ordelman, R. and Jones, G.J.F. 2011. Automatic Tagging and Geotagging in Video Collections and Communities. ACM International Conference on Multimedia Retrieval (ICMR 2011), pp. 51:1-51:8.
Task organizers:
Christoph Kofler, Delft University of Technology, Netherlands
Sebastian Schmiedeke, Technical University of Berlin, Germany
Isabelle Ferrané, University of Toulouse, France
This task is made possible by a collaboration of projects including: