The 2016 Context of Experience Task: Recommending Videos Suiting a Watching Situation (New!)
This task tackles the challenge of predicting the multimedia content that users find most fitting to watch in specific viewing situations. Most work on video recommendation focuses on predicting personal preferences, and as such overlooks cases in which context has a strong impact on preference, largely independently of the personal tastes of individual viewers. Particularly strong influence of context can be expected in unusual situations that are potentially psychologically or physically straining.

In this task, we focus on the case of viewers watching movies on an airplane. Here, independently of personal preferences, viewers share the common goal (which we consider a “viewing intent”) of passing the time and keeping themselves occupied in the small space of an airplane cabin. The objective of the task is to predict which videos allow viewers to achieve this goal, given the context, which includes the limitations of the technology (e.g., screen size) and the environment (e.g., background noise, interruptions, presence of strangers). We choose airplanes since the role of stress and viewers’ intent to distract themselves is widely acknowledged, e.g., in online descriptions such as [1]. Although this year’s task limits itself to the airplane scenario, we note that the challenge of Context of Experience is much broader in scope. Other stressful contexts in which videos are becoming increasingly important include hospital waiting rooms and dentists’ offices, where videos are shown during treatment.

The task will provide participants with a list of movies (including links to descriptions and video trailers), and require them to classify each movie into the +goodonairplane or -goodonairplane class. The ground truth of the task is derived from two sources: first, actual movie lists used by a major airline, and second, user judgments on movies collected via a crowdsourcing tool.

Task participants should form their own hypotheses about what is important for users viewing movies on an airplane, and design an approach using appropriate features and a classifier or decision function. Figure 1 gives an impression of the in-flight screen and its very specific attributes regarding size and video quality.

Figure 1: A set of conditions, including small screen and confined, crowded space, characterize the context of watching a movie on an airplane.

The following video was made to provide a more detailed impression of viewing conditions, and to encourage people to reflect on the characteristics that make certain movies suitable for watching on an airplane.



The value of the task lies in understanding the ability of content-based and metadata-based features to discriminate the kinds of movies that people would like to watch on small screens in stressful situations. The task is closely related to work in the area of Quality of Multimedia Experience and producer/consumer intent [2-5].

Task participants will be provided with a collection of videos (we will provide trailers as representative of the movies, together with the context, e.g., video URLs in different qualities, metadata, and user votes) and will need to develop methods that predict to which intent class each video belongs.
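As an illustration only, the following is a minimal sketch of one possible metadata-based baseline, assuming participants have already gathered textual metadata (e.g., plot synopses and genres) via the released links; the field names ("synopsis", "genres", "label") and the choice of a TF-IDF + logistic regression pipeline are our own assumptions, not part of the task definition.

```python
# Hypothetical baseline: classify movies as +goodonairplane (1) vs.
# -goodonairplane (0) from textual metadata gathered by the participant.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def build_baseline():
    # TF-IDF over textual metadata, fed into a linear classifier.
    return make_pipeline(
        TfidfVectorizer(lowercase=True, stop_words="english"),
        LogisticRegression(max_iter=1000),
    )

# Example usage with hypothetical development data:
# texts  = [m["synopsis"] + " " + m["genres"] for m in dev_movies]
# labels = [m["label"] for m in dev_movies]   # 1 = good on airplane
# model = build_baseline().fit(texts, labels)
# predictions = model.predict(test_texts)
```

Participants are of course free to replace this with content-based features extracted from the trailers, multimodal fusion, or any other decision function.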

Target group
The task is attractive to researchers with a wide range of interests since it can be addressed by leveraging techniques from multiple multimedia-related disciplines, including social computing (intent), machine learning (classification), multimedia content analysis, multimodal fusion, and crowdsourcing. It is also a practical and attractive topic from a content provider's point of view, since exploiting intent in combination with, for example, user satisfaction could lead to sophisticated ways of providing a better service to users.

Data
We will release a data set including titles and links that allow participants to gather online metadata and trailers for the movies. We will not provide the video files. The data set will include around 500 movies. Examples will be collected in part based on movie lists from a major international airline. This video dataset will contain both positive and negative examples, carefully sampled in order to create a fair and representative negative class. The data set will be split into a training set and a test set. To collect user judgements, we will use an existing system that has been built for the purpose of collecting user feedback of this sort. We will evaluate systems both with respect to the airline’s choice of movies and the crowd’s choice of airline-suitable movies. The crowd’s choice will be considered the authoritative labels.

Ground truth and evaluation
Overall, we are interested in measuring the accuracy with which an automatic method can distinguish between the different intent categories. Hence, given a set of labeled instances (videos + context + label) that can be used for training, participants should predict the labels of the test cases. As a first proposal, the classical measures of precision, recall, and weighted F1-score (WF1) could be used to quantify performance.
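To make the proposed measures concrete, here is a minimal sketch of how a run could be scored against the ground truth, assuming binary labels (1 = +goodonairplane, 0 = -goodonairplane); the label vectors below are purely illustrative.

```python
# Illustrative scoring of a participant run against ground-truth labels.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # ground-truth labels (illustrative)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # labels predicted by a run (illustrative)

precision = precision_score(y_true, y_pred)
recall    = recall_score(y_true, y_pred)
# Weighted F1 averages per-class F1 scores weighted by class support,
# which keeps the score meaningful if the two classes are imbalanced.
weighted_f1 = f1_score(y_true, y_pred, average="weighted")

print(f"P={precision:.2f} R={recall:.2f} WF1={weighted_f1:.2f}")
```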

Recommended reading
[1] http://www.tripinsurance.com/tips/guide-to-the-best-moviestv-shows-to-watch-on-a-plane, December 2014.

[2] Michael Riegler, Martha Larson, Concetto Spampinato, Jonas Markussen, Pål Halvorsen, and Carsten Griwodz. Introduction to a Task on Context of Experience: Recommending Videos Suiting a Watching Situation. MediaEval 2015 Proceedings.

[3] Zhu, Y., Heynderickx, I., & Redi, J. A. (2015). Understanding the role of social context and user factors in video Quality of Experience. Computers in Human Behavior, 49, 412-426.


Task organizers
Michael Riegler, Simula Research Laboratory and University of Oslo, michael (at) simula.no, Norway
Concetto Spampinato, University of Catania, Italy

Task auxiliaries
Minoo Kargar, Simula Research Laboratory and University of Oslo, Norway
Martha Larson, Delft University of Technology and Radboud University Nijmegen, Netherlands

Task schedule
1 May 2016: Data (development and test) released.
15 Sept. 2016: Deadline for run submission.
20 Sept. 2016: Results returned.
30 Sept. 2016: Working notes paper deadline.
20-21 Oct. 2016: MediaEval 2016 Workshop, right after ACM MM 2016 in Amsterdam.

Acknowledgments
NFR FRINATEK Project EONS
CrowdRec EC FP7 No. 610594