The 2014 Event Synchronization Task: Synchronization of Multi-User Event Media (SEM)
Content creation is increasingly a collective experience. People attending large social events (a soccer match, a concert), but also personal-scale ones (a wedding, a birthday party), collect dozens of photos and video clips with their smartphones, tablets, cameras, and, more recently, social cameras. This material is later exchanged in a number of different ways, including shared repositories, cloud services, and social networks.
These practices make different media galleries available to the whole community, so that any user who attended the event, or is simply interested in it, can create a personal view of it through summaries, stories, and personalized albums. However, such a large amount of data turns out to be unstructured and heterogeneous: even if it were possible to collect it all on the same hard drive, the variability in content, naming, and archiving strategies would make it impossible to organize the material in a simple yet effective manner.
For this reason, a major issue is the need to align and present the various collections in a consistent way. Particular challenges arise because the time and location information attached to the captured media (timestamp, GPS) can be wrong, inaccurate, or even missing (for instance, due to an incorrect clock/calendar setting, a different time zone, or the modification or removal of tags). Similar challenges are common for historical events and photo archives, where timestamps and especially location information are rarely available. In other cases, images may be processed offline for post-production, thus losing the correct temporal information. In such cases, creating a single timeline can turn out to be complicated, with a concrete risk of representing the event in a misleading way.
In the scenario for this task, we imagine a number of users (10+) attending the same event and taking photos and videos with different, non-synchronized devices (smartphones, cameras, tablets). Each user may take an arbitrary number of photos, possibly covering just a part of the event, with a variable density of acquisitions (single photos). Part of these pictures will be made available with complete annotations in terms of timestamp and GPS coordinates, while others will only be provided with partial annotations. We assume that each user collection is internally consistent: when the timestamp is available, it is coherent over the whole gallery, and the same applies to GPS.
The SEM task is posed as follows: “Given N image collections (galleries) taken by different users/devices at the same event, find the best (relative) time alignment among them and detect the significant sub-events over the whole gallery.”
The target group for this task includes researchers in the areas of event detection and analysis, and multimedia indexing and retrieval.
Two different datasets will be released, containing pictures arranged in galleries and describing large-scale events extending over multiple days. A part of one of the two datasets will be provided for training.
The datasets draw on different sources and different events. Large events will be addressed, so as to ensure a wide availability of data crawled from social networks (large sports events such as world championships, big music concerts, political or social events, large-scale natural events, etc.). The selected media will come with suitable ground truth; in particular, all of them will originally be time-stamped.
Collection of data under a Creative Commons license is preferred. Media tags may be manipulated in order to include a number of realistic problems in the datasets (e.g., removing or modifying the timestamp, introducing an arbitrary delay, removing GPS coordinates, etc.). Such modifications will not affect the working assumptions. The events used for training and experiments may differ in type and instance from those used in the final challenge.
Ground truth and evaluation
Two objective metrics will be used to evaluate the results:
• time synchronization error
• sub-event detection error
As far as the first metric is concerned, the goal of the participants is to maximize the number of galleries for which the synchronization error is below a predefined threshold, and to minimize the time shift of those galleries.
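As an illustration of how such a criterion could be scored, the sketch below counts the galleries whose absolute time-alignment error falls below a threshold and averages the residual shift of those galleries. This is a minimal sketch, not the official evaluation script: the function name `sync_score`, the 60-second threshold, and the exact aggregation are assumptions for illustration only.

```python
def sync_score(offsets_pred, offsets_true, threshold=60.0):
    """Evaluate per-gallery time alignment (illustrative sketch).

    offsets_pred / offsets_true: relative time offsets (in seconds) of each
    gallery with respect to a common reference timeline.
    Returns (number of galleries synchronized within `threshold`,
             mean absolute shift of those galleries).
    `threshold` is a hypothetical value; the task defines its own.
    """
    errors = [abs(p - t) for p, t in zip(offsets_pred, offsets_true)]
    within = [e for e in errors if e < threshold]
    mean_shift = sum(within) / len(within) if within else float("inf")
    return len(within), mean_shift

# Three galleries: predicted vs. ground-truth relative offsets (seconds).
# Errors are 10, 10, and 45 s, all below the 60 s threshold.
n_synced, mean_shift = sync_score([10.0, 130.0, -5.0], [0.0, 120.0, -50.0])
print(n_synced, mean_shift)  # 3 galleries synchronized
```

Under this scoring, a submission is rewarded first for bringing more galleries under the threshold, then for reducing the residual shift of those galleries.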
Once the galleries are synchronized, it will be possible to cluster the whole set to detect sub-events within the whole story, for instance, the main actions in a football match. Sub-events will be defined in a neutral and unbiased way (e.g., making reference to the calendar/schedule of the event) and coded into the ground truth.
The second metric will then measure the performance of the sub-event clustering over the whole synchronized set of media.
In this case, we use a single performance indicator, the Rand Index (RI), which measures clustering accuracy as the rate of correct pairwise decisions: pairs of items correctly placed together in the same cluster, or correctly placed in different clusters.
Francesco G. B. De Natale, University of Trento, Italy
Vasileios Mezaris, ITI - CERTH, Greece
Nicola Conci, University of Trento, Italy
22 May: Development data release (Updated release date)
15 June: Test data release
5 September: Run submission due
19 September: Results returned
28 September: Working notes paper deadline
Note that this task is a "Brave New Task" and 2014 is the first year that it is running in MediaEval. If you sign up for this task, you will be asked to keep in particularly close touch with the task organizers concerning the task goals and the task timeline.