MediaEval 2016

The MediaEval 2016 workshop took place on 20-21 October 2016 at the Netherlands Institute for Sound and Vision in Hilversum, Netherlands, right after ACM Multimedia 2016 in Amsterdam. The workshop brought together task participants to present and discuss their findings and to prepare for future work. A list of the tasks offered in MediaEval 2016 is below. Workshop materials and media are available online:

MediaEval 2016 Working Notes Proceedings http://ceur-ws.org/Vol-1739/
Summary of MediaEval 2016, which was published in IEEE Multimedia
MediaEval 2016 Workshop Program
Slides from the workshop presentations and posters from the poster sessions: http://www.slideshare.net/multimediaeval
The workshop video: MediaEval 2016 Workshop Video by Richard Sutcliffe
Playlist of the task overview presentations: MediaEval 2016 Task Overview Videos
Videos of other presentations (experimental this year): MediaEval Community YouTube Channel
For pictures, check out MediaEval on Flickr
MediaEval 2016 Acknowledgements


Did you know?
Over its lifetime, MediaEval teamwork and collaboration have given rise to over 600 papers, not only in the MediaEval workshop proceedings but also at conferences and in journals. Check out the MediaEval bibliography.

New in 2016 were MediaEval Task Forces: groups of people working together to investigate the viability of an idea for a MediaEval 2017 task. (Scroll to the bottom of the list.)

MediaEval 2016: Task List
The following tasks were offered by MediaEval 2016. (See MediaEval 2016 Task Overview Videos for individual videos for each task):

Verifying Multimedia Use Task
Participants are required to design a system that verifies whether a tweet and its accompanying multimedia content item (image or video) truthfully reflect a real-world event. The system can make use of any available tweet characteristics and/or features derived from the multimedia content. The data used are posts shared on social media during high-impact events. The F1 score is the metric that will be used for evaluation. Read more...
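As a rough illustration of the F1 metric mentioned above, the following minimal sketch computes it for a binary labeling. The label names "fake" and "real", the choice of "fake" as the positive class, and the function name are assumptions made for this example; they are not taken from the task's official scoring tool.

```python
def f1_score(y_true, y_pred, positive="fake"):
    """F1 for one target class: harmonic mean of precision and recall.

    The label names and the choice of positive class are illustrative
    assumptions, not the task's official definition.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = ["fake", "fake", "real", "real"]
    predicted = ["fake", "real", "fake", "real"]
    print(f1_score(truth, predicted))
```

Because F1 balances precision against recall, a system that labels every post as "fake" does not score well unless most posts really are fake.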

Emotional Impact of Movies Task
For this task, participating teams develop approaches that automatically infer the emotional impact of movies. Specifically, induced valence and arousal scores must be predicted in two scenarios: (1) global prediction for short video excerpts, and (2) continuous prediction for longer videos. The training data consists of Creative Commons-licensed movies (professional and amateur) together with human annotations of valence-arousal ratings. The results will be evaluated using standard evaluation metrics. Read more...

Querying Musical Scores with English Noun Phrases Task (C@MERATA)
The input is a natural language phrase referring to a musical feature (e.g., ‘consecutive fifths’) together with a classical music score and the required output is a list of passages in the score which contain that feature. Scores are in the MusicXML format which can capture most aspects of Western music notation. Evaluation is via versions of Precision and Recall relative to a Gold Standard produced by the organisers. Read more...

Predicting Media Interestingness Task (New!)
This task requires participants to automatically select frames or portions of movies which are the most interesting for a common viewer. To solve the task, participants can make use of the provided visual, audio and text content. System performance is to be evaluated using standard Mean Average Precision. Read more...
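For readers unfamiliar with Mean Average Precision, here is a minimal sketch of one common formulation. It assumes each system output is a ranked list of binary relevance flags (1 = judged interesting) and that every relevant item appears somewhere in the list. The function names are hypothetical and the official evaluation may differ in detail.

```python
def average_precision(ranked_labels):
    """Average precision of one ranked list of 0/1 relevance flags.

    Assumes all relevant items appear in the list, so the sum of
    precision-at-hit values is divided by the number of hits.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-list average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Two hypothetical movies, each with a ranked list of frames.
    print(mean_average_precision([[1, 0, 1], [0, 1]]))
```

The metric rewards systems that place interesting items near the top of the ranking, since each hit contributes the precision measured at its own rank.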

Zero Cost Speech Recognition Task (ex QUESST)
The goal of this task is to challenge teams to devise and experiment with bootstrapping techniques that make it possible to train an initial ASR system or speech tokenizer for “free”. By “free” we mean a technique that makes it possible to train a speech recognition system on public resource data without the need to buy datasets (expensive ones, and ideally any at all). This year’s language will be Vietnamese. Read more...

Placing Task
The Placing Task requires participants to estimate the locations where multimedia items (photos or videos) were captured solely by inspecting the content and metadata of these items, and optionally exploiting additional knowledge sources such as gazetteers. Read more...

Multimodal Person Discovery in Broadcast TV Task
Given raw TV broadcasts, each shot must be automatically tagged with the name(s) of people who can be both seen and heard in the shot. The list of people is not known a priori and their names must be discovered in an unsupervised way from provided text overlay or speech transcripts. The task will be evaluated on a composite multilingual corpus from INA (French), DW (German and English) and UPC (Catalan), using standard information retrieval metrics based on a posteriori collaborative annotation of the corpus by the participants themselves. Read more...

Retrieving Diverse Social Images Task
This task requires participants to refine a ranked list of Flickr photos retrieved with general-purpose multi-topic queries using provided visual, textual and user credibility information. Results are evaluated with respect to their relevance to the query and how diversely they represent it. Read more...

Context of Multimedia Experience Task
This task develops multimodal techniques for automatic prediction of multimedia in a particular consumption context. In particular, we focus on the context of predicting movies that are suitable to watch on airplanes. Input to the prediction methods consists of movie trailers and metadata from IMDb. Output is evaluated using the Weighted F1 score, with expert labels as ground truth. Read more...
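A weighted F1 score is commonly computed as the per-class F1 averaged with weights proportional to each class's share of the ground-truth labels. The sketch below follows that common definition. The class names "good" and "bad" and the function names are assumptions for illustration, and the task's official scoring may use a different implementation.

```python
from collections import Counter


def class_f1(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged by class support in the ground truth."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        class_f1(y_true, y_pred, label) * count / total
        for label, count in support.items()
    )


if __name__ == "__main__":
    # Hypothetical expert labels: is the movie suitable for a flight?
    truth = ["good", "good", "good", "bad"]
    predicted = ["good", "good", "bad", "bad"]
    print(weighted_f1(truth, predicted))
```

Weighting by support keeps a rare class from dominating the score while still counting errors on it, which matters when suitable and unsuitable movies are unevenly represented.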

MediaEval 2016: Task Forces
This year, MediaEval is working to facilitate a set of Task Forces. MediaEval Task Forces are groups of people who will work together towards developing a new MediaEval task to propose in 2017. Task Forces undertake a pilot and/or contribute a paper to the workshop proceedings. You can check the Task Force boxes when you register on the MediaEval 2016 registration page to indicate that you would be interested in participating in a task force. If you are interested in initiating a task force, please contact Martha Larson m (dot) a (dot) larson (at) tudelft.nl

Task Force: Scene change: Using user photos to detect changes in 3D scenes over time
This task force was initiated by a Dutch company interested in detecting changes between sets of pictures of a scene shot within the same time range on different days. The training data for this task would consist of user-contributed photos (smartphones and low end devices), their 3D positions and viewing angles and a point cloud per set. The challenges are identifying and exploiting the multimedia aspects of the task, developing the evaluation metrics, and allowing the task to scale.

Task Force: Sky and the Social Eye: Enriching Remote-Sensed Events
This task force is an ACM Multimedia 2016 Grand Challenge.
This task is dedicated to linking social multimedia to events that can be detected in satellite images, such as floods, fires, land clearing, etc. Consider, for instance, linking an event such as a bushfire or a flood, to media reports, Instagram, Flickr, Twitter, web pages, Wikipedia, etc. Effectively, the task is to take a long history of Landsat images, identify change events, and link them to other sources of information at large. This task force was initiated by a research group in Australia. The task force may make use of the YFCC100m dataset and resources made available by the Multimedia Commons Initiative.

Task Force: Detecting Bias in Large Multimedia Datasets
This task force aims to develop a task that would allow us to understand possible sources of bias in the multimedia datasets that we use to develop algorithms. We understand “bias” to be ways in which a dataset differs from its assumed or desirable underlying distribution. Learning how to identify this bias will ultimately contribute to preventing algorithms from taking on unintended tendencies (e.g., privileging certain ethnic or socioeconomic groups). The initial idea for the task is that two teams would play against each other. The first creates a dataset sampled with a known bias, and the second analyzes the dataset to attempt to reconstruct the bias. The task force will focus on the YFCC100m dataset and resources made available by the Multimedia Commons Initiative.

General Information about MediaEval

MediaEval was founded in 2008 as a track called "VideoCLEF" within the CLEF benchmark campaign. In 2010, it became an independent benchmark and in 2012 it ran for the first time as a fully "bottom-up benchmark", meaning that it is organized for the community, by the community, independently of a "parent" project. The MediaEval benchmarking season culminates with the MediaEval workshop. Participants come together at the workshop to present and discuss their results, build collaborations, and develop future task editions or entirely new tasks. Past working notes proceedings of the workshop include:

MediaEval 2012: http://ceur-ws.org/Vol-807
MediaEval 2013: http://ceur-ws.org/Vol-1043
MediaEval 2014: http://ceur-ws.org/Vol-1263
MediaEval 2015: http://ceur-ws.org/Vol-1436

MediaEval 2016 Community Council
Guillaume Gravier (IRISA, France)
Gareth Jones (Dublin City University, Ireland)
Bogdan Ionescu (University Politehnica of Bucharest, Romania)
Martha Larson (Delft University of Technology, Netherlands) (Contact and Workshop General Chair)
Mohammad Soleymani (University of Geneva, Switzerland) (Workshop General Chair)

Acknowledgments
Key contributors to 2016 organization
Saskia Peters (Delft University of Technology, Netherlands)
Bogdan Boteanu (University Politehnica of Bucharest, Romania)
Gabi Constantin (University Politehnica of Bucharest, Romania)
Andrew Demetriou (Delft University of Technology, Netherlands)
Richard Sutcliffe (University of Essex, UK)

Sponsors and Supporters:
MediaEval 2016 thanks these organizations for their sponsorship and support:

Netherlands Institute for Sound and Vision (Beeld en Geluid)

IAPR
Technical Committee TC12 "Multimedia and Visual Information Systems"
of the International Association for Pattern Recognition

MediaEval Benchmarking Initiative for Multimedia Evaluation

The "multi" in multimedia: speech, audio, visual content, tags, users, context
