The 2016 Retrieving Diverse Social Images Task
Register to participate in this challenge on the MediaEval 2016 registration site.

This task addresses the problem of image search result diversification in the context of social media.

This year we address the use case of a general ad-hoc image retrieval system that provides users with diverse representations of their queries (see, for instance, Google Image Search). The system should be able to handle complex, general-purpose, multi-concept queries.

Participants are given a ranked list of query-related photos retrieved from Flickr and are required to refine the results by providing a set of images that are both relevant to the query and constitute a diversified summary of it. Initial results are typically noisy and often redundant. The refinement and diversification process can draw on the social metadata associated with the images, on their visual characteristics, on information related to user tagging credibility (an estimation of the global quality of the tag-image content relationships in a user's contributions), or on external resources (e.g., the Internet).
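One common way to combine relevance and diversity, which participants might use as a starting point, is a greedy maximal-marginal-relevance (MMR) style re-ranking that repeatedly picks the candidate with the best trade-off between its relevance score and its visual similarity to already-selected images. The sketch below is purely illustrative: the function name, the `lam` trade-off parameter, and the assumption that relevance scores and L2-normalised visual descriptors are available are all ours, not part of the official task baseline.

```python
import numpy as np

def greedy_diversify(images, rel_scores, features, k=50, lam=0.7):
    """Greedy MMR-style re-ranking (illustrative sketch).

    images     -- list of image ids in the initial Flickr ranking
    rel_scores -- dict: image id -> relevance score (e.g., from text metadata)
    features   -- dict: image id -> L2-normalised visual descriptor (np.ndarray)
    k          -- number of images to return
    lam        -- trade-off: 1.0 = pure relevance, 0.0 = pure diversity
    """
    selected = []
    candidates = list(images)
    while candidates and len(selected) < k:
        def mmr(img):
            # Redundancy = highest cosine similarity to any selected image.
            redundancy = max(
                (float(features[img] @ features[s]) for s in selected),
                default=0.0,
            )
            return lam * rel_scores[img] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With lam=0.7, a near-duplicate of an already-selected image is penalised enough that a somewhat less relevant but visually distinct image can overtake it, which is exactly the behaviour the task rewards via cluster recall.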

This task is a follow-up to the 2013, 2014, and 2015 editions. The novelty this year is the multi-concept, ad-hoc query scenario.

Target group
The task targets communities working on both machine- and human-based media analysis, such as image retrieval (text, vision, and multimedia communities), re-ranking, relevance feedback, crowd-sourcing, and automatic geo-tagging.

Data
The dataset consists of redistributable Creative Commons licensed information about general-purpose multi-topic queries. Each query is represented with up to 300 Flickr photos and their associated social metadata (e.g., title, description, geo-tagging information, the number of times the photo has been displayed, number of posted comments, etc.). The data are partitioned as follows: (1) development data intended for designing and training the approaches (ca. 100 multi-concept queries with 30,000 images); (2) credibility data intended for estimating user tagging credibility descriptors (metadata for 685 different users); (3) evaluation data intended for the actual benchmarking (ca. 70 multi-concept queries with 21,000 images).

To encourage participation of groups from different communities, e.g., text, vision, multimedia, resources such as general-purpose visual descriptors and text models will be provided for the entire collection.

Ground truth and evaluation
All the images are to be annotated in terms of relevance to the query and diversity. Annotations are to be carried out by expert annotators. Relevance annotation will consist of yes/no annotations (including the “don’t know” option). Input from different annotators is aggregated with majority voting schemes. Diversity annotation will mainly consist of regrouping visually similar images into clusters (up to 25 clusters per query). Each image cluster is provided with a short text description that justifies its choice. Naturally, only relevant images are annotated for diversity.

System performance is to be assessed in terms of Cluster Recall at X (CR@X), which measures how many different clusters from the ground truth are represented among the top X results (only relevant images are considered); Precision at X (P@X), which measures the proportion of relevant photos among the top X results; and F1-measure at X (F1@X), defined as the harmonic mean of the previous two. Various cutoff points are to be considered, e.g., X = 5, 10, 20, 30, 40, 50.

Recommended reading
[1] Working notes of the 2015 MediaEval Retrieving Diverse Social Images task, CEUR-WS.org, Vol. 1436, ISSN: 1613-0073.
[2] Working notes of the 2014 MediaEval Retrieving Diverse Social Images task, CEUR-WS.org, Vol. 1263, ISSN: 1613-0073.
[3] Ionescu, B., Gînscă, A.L., Boteanu, B., Lupu, M., Popescu, A., Müller, H. Div150Multi: A Social Image Retrieval Result Diversification Dataset with Multi-topic Queries. In Proceedings of ACM MMSys. Klagenfurt, Austria, 2016.
[4] Ionescu, B., Popescu, A., Lupu, M., Gînscă, A.L., Boteanu, B., Müller, H., Div150Cred: A Social Image Retrieval Result Diversification with User Tagging Credibility Dataset. In Proceedings of ACM MMSys. Portland, Oregon, USA, 2015.
[5] Ionescu, B., Radu, A.-L., Menéndez, M., Müller, H., Popescu, A., Loni, B. Div400: A Social Image Retrieval Result Diversification Dataset. In Proceedings of ACM MMSys. Singapore, 2014.
[6] Gînscă, A.L., Popescu, A., Ionescu, B., Armagan, A., Kanellos, I. Toward Estimating User Tagging Credibility for Social Image Retrieval. In Proceedings of ACM Multimedia. Orlando, Florida, USA, 2014.

Task organizers
Bogdan Ionescu, LAPI, University "Politehnica" of Bucharest, Romania (contact person),
Alexandru Lucian Gînscă, CEA LIST, France
Maia Zaharieva, University of Vienna and Vienna University of Technology, Austria,
Mihai Lupu, Vienna University of Technology, Austria,
Henning Müller, University of Applied Sciences Western Switzerland in Sierre, Switzerland.

Task auxiliaries
Adrian Popescu, CEA LIST, France,
Bogdan Boteanu, LAPI, University Politehnica of Bucharest, Romania.


Task schedule
31 May: Development and test data release;
16 September: Run submission due;
19 September: Results returned;
30 September: Camera ready working notes paper deadline;
20-21 October: MediaEval 2016 Workshop, right after ACM MM 2016 in Amsterdam.

This task is supported by Vienna Science and Technology Fund (WWTF) through project ICT12-010.

We also acknowledge the precious help of the task supporters (by alphabetic order): Gabi Constantin, Lukas Diem, Ivan Eggel, Laura Fluerătoru, Ciprian Ionașcu, Corina Macovei, Cătălin Mitrea, Irina Emilia Nicolae, Mihai Gabriel Petrescu, Andrei Purică.