Diverse Images (New!)

Announcement of Data Release
The task has concluded and the data has been released. Please see MediaEval Datasets.

The 2013 Retrieving Diverse Social Images Task
The diversity task is a new task for 2013. The task addresses the problem of result diversification in social photo retrieval.

We consider a tourist use case where a person tries to find more information about a place she is potentially visiting. The person has only a vague idea about the location, knowing the name of the place. She uses the name to learn additional information about the monument from the Internet, for instance by visiting a Wikipedia page, e.g., getting a photo, the geographical position of the place and basic descriptions. Before deciding whether this location suits her needs, the person is interested in getting a more complete visual description of the place.

In this task, participants are provided with a ranked list of photos of a location retrieved from common social photo search engines. These results are typically noisy and redundant. The task requires participants to exploit the provided visual and textual information to refine these results by selecting only a sub-set of photos.

The selected photos should be, in equal measure, representative (accurate matches of the query location), diverse (depicting the query location in a complete manner, e.g., different views of a monument at different times of the day/year and under different weather conditions, drawings, sketches) and compact (only very few results are selected, e.g., 10-20 photos per query, which fit a typical page of search results).
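As an illustration of the refinement step described above, the following is a minimal sketch of one possible baseline, not a method prescribed by the task. It assumes each photo already has a numeric feature vector (for instance, one of the provided visual descriptors) and picks photos greedily so that each new photo is as far as possible from those already chosen. The function names, the Euclidean distance and the toy feature values are all hypothetical choices made for the example.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_diverse(
    ranked_ids: List[str],
    features: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Greedy farthest-first selection of up to k photos.

    Starts from the top-ranked photo, then repeatedly adds the photo whose
    smallest distance to the already selected photos is largest. Ties are
    resolved in favour of the photo ranked higher in the initial list.
    """
    if k <= 0 or not ranked_ids:
        return []
    selected = [ranked_ids[0]]
    remaining = list(ranked_ids[1:])
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda pid: min(
                euclidean(features[pid], features[s]) for s in selected
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy example: hypothetical 2-D descriptors for five photos of one location.
toy_features = {
    "p1": (0.0, 0.0),
    "p2": (0.1, 0.0),
    "p3": (5.0, 5.0),
    "p4": (5.0, 5.1),
    "p5": (0.0, 5.0),
}
toy_ranking = ["p1", "p2", "p3", "p4", "p5"]
print(select_diverse(toy_ranking, toy_features, k=3))
```

Note that this sketch addresses diversity and compactness only. It does nothing to filter out irrelevant photos, and an off-topic photo may even be favoured because it lies far from everything else, so a relevance filtering step would be needed before it in practice.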

Target group
Researchers in areas including, but not limited to, image retrieval, re-ranking, relevance feedback, crowd-sourcing and automatic geo-tagging.

Data
The data set comprises ca. 400 locations, ranging from very famous ones (e.g., Colosseum of Rome) to ones lesser known to the general public (e.g., Palazzo delle Albere). For each location, we provide a ranked list of photos of varying quality retrieved from social media platforms using the name or GPS coordinates of the location (e.g., Flickr; ca. 100-150 results/location). To serve as reference information, each location is accompanied by a representative photo and a location description from Wikipedia. To encourage participation of groups from different research areas, additional resources such as state-of-the-art visual descriptors and textual location models will be provided for the entire collection. To respond to the task, participants are free to use other external data sources, such as Internet resources. In total, the data set will consist of around 44,000 photos. The data set is to be divided into a development set (e.g., around 50 locations to be used for training/tuning the methods) and a test set (for final evaluation). The data set will be restricted to contain only Creative Commons (CC) photos.

Ground truth and evaluation
Expert ground truth is going to be generated by a small group of human annotators with advanced knowledge of monument and location characteristics. The ground truth will mainly consist of determining which photos are relevant and then re-grouping them into clusters of similar appearance. In addition, we are also considering generating crowdsourced ground truth (at least for a part of the test set) using specialized platforms (e.g., Crowdflower, Amazon Mechanical Turk).

Performance will be assessed for both representativeness and diversity. Evaluation metrics include average precision (measure of how many of the selected photos are relevant), completeness (computed as a recall measure of the total number of clusters, i.e., the groups of photos depicting the monument in a very similar way, that can be obtained from the true positive selected photos), cluster precision (measure of how many of the clusters in the final refinement are relevant), cluster recall (measure of how many of the existing clusters are represented in the final refinement) and F-measures (with different weightings).
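To make some of these metrics concrete, here is a minimal sketch of how precision, cluster recall and a weighted F-measure could be computed for one query. It is an interpretation of the descriptions above, not the official evaluation tool: the text does not give exact formulas, so the function names, the ground truth layout (a mapping from photo id to cluster id, with None for irrelevant photos) and the use of plain set precision in place of average precision are all assumptions.

```python
from typing import Dict, List, Optional


def precision(selected: List[str], cluster_of: Dict[str, Optional[int]]) -> float:
    """Fraction of the selected photos that are relevant.

    A photo counts as relevant if the ground truth assigns it a cluster id.
    """
    if not selected:
        return 0.0
    relevant = sum(1 for p in selected if cluster_of.get(p) is not None)
    return relevant / len(selected)


def cluster_recall(
    selected: List[str],
    cluster_of: Dict[str, Optional[int]],
    n_clusters: int,
) -> float:
    """Fraction of ground truth clusters represented among the selected photos."""
    if n_clusters == 0:
        return 0.0
    covered = {cluster_of[p] for p in selected if cluster_of.get(p) is not None}
    return len(covered) / n_clusters


def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision p and recall r.

    beta > 1 weights recall more heavily; beta < 1 weights precision more.
    """
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# Toy ground truth for one location: three clusters, photo "d" is irrelevant.
ground_truth = {"a": 0, "b": 0, "c": 1, "d": None, "e": 2}
run = ["a", "b", "d", "c"]

p = precision(run, ground_truth)
cr = cluster_recall(run, ground_truth, n_clusters=3)
print(p, cr, f_measure(p, cr))
```

In this toy run, three of the four selected photos are relevant and two of the three clusters are covered, so a run that picks redundant photos from the same cluster is penalized on cluster recall even when its precision is high.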

Recommended reading
[1] A.-L. Radu, J. Stöttinger, B. Ionescu, M. Menéndez, F. Giunchiglia, “Representativeness and Diversity in Photos via Crowd-Sourced Media Analysis”, 10th International Workshop on Adaptive Multimedia Retrieval - AMR, Copenhagen, Denmark, 2012.
[2] S. Rudinac, A. Hanjalic, M. Larson, “Finding Representative and Diverse Community Contributed Images to Create Visual Summaries of Geographic Areas”, ACM Multimedia, pp. 1109-1112, 2011.
[3] Y. Avrithis, Y. Kalantidis, G. Tolias, E. Spyrou, “Retrieving Landmark and Non-Landmark Images from Community Photo Collections”, ACM Multimedia, pp. 153-162, 2010.
[4] B. Taneva, M. Kacimi, G. Weikum, “Gathering and Ranking Photos of Named Entities with High Precision, High Recall, and Diversity”, WSDM '10, Proceedings of the third ACM international conference on Web search and data mining, pp. 431-440, 2010.
[5] S. Nowak, S. Ruger, “How Reliable are Annotations via Crowdsourcing? A Study About Inter-Annotator Agreement for Multi-Label Image Annotation”, Int. Conf. on Multimedia Information Retrieval, page 557, 2010.
[6] R.H. van Leuken, L. Garcia, X. Olivares, R. van Zwol, “Visual Diversification of Image Search Results”, ACM International Conference on World Wide Web, pp. 341-350, 2009.
[7] T. Deselaers, T. Gass, P. Dreuw, H. Ney, “Jointly Optimising Relevance and Diversity in Image Retrieval”, ACM Conference on Image and Video Retrieval, pp. 1-8, 2009.

Task organizers
Bogdan Ionescu, LAPI, University Politehnica of Bucharest, Romania
María Menéndez, KnowDive, Department of Information Engineering and Computer Science, University of Trento, Italy
Henning Müller, University of Applied Sciences Western Switzerland (HES-SO) in Sierre, Switzerland
Adrian Popescu, CEA LIST, France

Task auxiliaries
Anca-Livia Radu, LAPI, University Politehnica of Bucharest, Romania & KnowDive, Department of Information Engineering and Computer Science, University of Trento, Italy
Ionuț Mironică, LAPI, University Politehnica of Bucharest, Romania
Bogdan Boteanu, LAPI, University Politehnica of Bucharest, Romania

Many thanks to the task supporters for their precious help: Ivan Eggel, Sajan Raj Ojha, Oana Pleș, Ionuț Duță, Andrei Purica, Macovei Corina and Irina Nicolae.

Note that this task is a "Brave New Task" and 2013 is the first year that it is running in MediaEval. If you sign up for this task, you will be asked to keep in particularly close touch with the task organizers concerning the task goals and the task timeline.

Task schedule
1 May: Development data release
3 June: Test data release
9 September: Run submission due
16 September: Results returned
28 September: Working notes paper deadline

This task is made possible by a collaboration of projects including:

EXCEL POSDRU/89/1.5/S/62557

MediaEval Benchmarking Initiative for Multimedia Evaluation

The "multi" in multimedia: speech, audio, visual content, tags, users, context