Placing Task

The 2012 Placing Task
The task requires participants to assign geographical coordinates (latitude and longitude) to each provided test video.

If you are interested in using the data set, please contact Pascal Kelm, TU Berlin.

Code from past Placing Tasks has been made available on GitHub:

https://github.com/PascalKelm/PlacingTask
https://github.com/ovlaere
https://github.com/xarabas/MediaEval2012-PlacingTask-IRISA

Please also see the Placing Task papers in the MediaEval 2012 working notes proceedings: http://ceur-ws.org/Vol-927/

Based on feedback from previous Placing Tasks, we have split the task into two sub-tasks: the General Placing sub-task and the City-Based sub-task.

The General Placing Sub-Task (with mandatory/optional runs) is the same as in previous years and asks participants to accurately and precisely derive the location of a given video, potentially anywhere in the world, using textual, visual and/or audio data (depending on the constraints of the run). For this sub-task, participants are provided with geotagged video and photo training data.
The City-Based Sub-Task (optional) asks participants to derive highly accurate and precise locations for a given video within a known city, using only the visual and/or audio information available in the video. A list of chosen cities will be provided. For this sub-task, participants are not provided with training data (but possible external sources will be suggested).

Target group
The task is of interest to researchers in the area of geographic multimedia information retrieval, social media and media analysis.

Data
General Placing Sub-Task Data: The data set is an extension of the 2010-2012 Placing Task data sets and contains a set of geotagged Flickr videos (~15,000 for development and ~4,300 for test purposes) as well as the metadata for geotagged Flickr images (~3.2 million). A set of basic visual features extracted for all images and for the frames of the videos is provided. Evaluation of runs submitted by participating groups will be based on Great Circle distances between the predicted and the actual geo-coordinates encoded in the video. Ground truth is supplied by the Flickr users who uploaded the videos and the images. All videos and images are shared by their owners under Creative Commons licenses.
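As an illustration of the Great Circle distance mentioned above, the following is a minimal sketch using the haversine formula. The function name `great_circle_km` and the Earth radius constant are assumptions made for this example; the task description does not specify the exact formula or radius used by the official evaluation.

```python
import math

# Mean Earth radius in kilometres. This value is an assumption for the sketch;
# the task description does not state which radius the evaluation uses.
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine term; clamp to 1.0 to guard against floating-point overshoot.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

A run could then be scored per video by calling `great_circle_km(pred_lat, pred_lon, true_lat, true_lon)`, where the predicted and ground-truth coordinates are hypothetical variable names.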

City-Based Sub-Task Data: Participants are encouraged to find their own source of training data for this task and could consider sources such as http://image.ntua.gr/iva/datasets/ . The test data will be YouTube videos, crawled for a number of cities worldwide, that have Creative Commons licenses and geographic coordinates. The videos will be manually checked to confirm that they contain information sufficient to place them.

Ground truth and evaluation
The geo-coordinates associated with the Flickr/YouTube video will be used as the ground truth. Since these do not always serve to precisely pinpoint the location of the video, we will evaluate at each of a series of widening circles: 10 m, 100 m, 1 km, 10 km, 100 km, 1,000 km and 10,000 km. (The wider circles will be used in the General Placing Sub-Task, but not in the City-Based Sub-Task.) Note: We are encouraging participants to be more accurate than in previous years by including the smaller error radii.
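The evaluation over widening circles can be sketched as follows. This is a hedged illustration, assuming the score at each radius is the fraction of test videos whose predicted location falls within that distance of the ground truth, and that the boundary is inclusive; the official scoring script may differ. The names `RADII_KM` and `accuracy_at_radii` are hypothetical.

```python
# Radii from the task description, converted to kilometres.
RADII_KM = [0.01, 0.1, 1, 10, 100, 1000, 10000]


def accuracy_at_radii(errors_km, radii_km=RADII_KM):
    """Return {radius_km: fraction of videos placed within that radius}.

    errors_km holds one distance per test video (predicted vs. ground truth),
    in kilometres. The inclusive comparison (<=) is an assumption.
    """
    if not errors_km:
        raise ValueError("errors_km must not be empty")
    total = len(errors_km)
    return {r: sum(1 for e in errors_km if e <= r) / total for r in radii_km}
```

For the City-Based Sub-Task, only the smaller radii would be passed as `radii_km`, since the wider circles are not used there.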

Task schedule
18 May: Development set release
22 June: Test set release
25 August: Run submission deadline (extended)
31 August: Results of the task released (extended)

Recommended reading
Similar research on automatic geotagging of images has recently been described in:

Kelm, P., Schmiedeke, S., and Sikora, T. A Hierarchical, Multi-modal Approach for Placing Videos on the Map using Millions of Flickr Photographs. In ACM Multimedia Workshop on Social and Behavioral Networked Media Access (SBNMA) 2011.

Serdyukov, P., Murdock, V., and van Zwol, R. Placing flickr photos on a map. In SIGIR 2009.

Hays, J. and Efros, A. A. im2gps: estimating geographic information from a single image. In CVPR 2008.

The 2011 Placing Task overview paper

Task organizers:
Pascal Kelm, Technical University of Berlin, Germany
Adam Rae, Yahoo! Research, Spain

This task is organized by the EU FP7 projects Glocal: Event-Based Retrieval of Networked Media and VideoSense - Centre of Excellence

MediaEval Benchmarking Initiative for Multimedia Evaluation
