The 2018 Recommending Movies Using Content: Which content is key?

Task description
The goal of the task is to use content-based features to predict how a movie is received by its viewers. Task participants must create an automatic system that predicts the average rating that users assign to a movie (representing the audience's global appreciation of the movie) and also the rating variance (representing the agreement or disagreement among user ratings). The input to the system is a set of audio, visual and text features derived from trailers and selected movie scenes (movie clips).
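To make the two prediction targets concrete, the following minimal sketch derives a per-movie average rating and rating variance from raw user ratings. It is an illustration only: the function name `rating_targets`, the toy ratings, and the choice of population variance (rather than sample variance) are assumptions, not part of the official task definition.

```python
from collections import defaultdict
from statistics import mean, pvariance


def rating_targets(ratings):
    """Group (movie_id, rating) pairs by movie.

    Returns {movie_id: (average_rating, rating_variance)}.
    Population variance is an assumption made for this sketch.
    """
    by_movie = defaultdict(list)
    for movie_id, rating in ratings:
        by_movie[movie_id].append(rating)
    return {m: (mean(r), pvariance(r)) for m, r in by_movie.items()}


# Hypothetical toy ratings: (movie_id, rating)
toy_ratings = [(1, 4.0), (1, 5.0), (1, 3.0), (2, 2.0), (2, 2.0)]
targets = rating_targets(toy_ratings)

for movie_id, (avg, var) in sorted(targets.items()):
    print(f"movie {movie_id}: average={avg:.2f}, variance={var:.2f}")
```

In this toy example, movie 2 has zero variance because its two raters agree, while movie 1 has the higher average rating but the raters disagree more.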

The task addresses the question of which kinds of content are most helpful for predicting how a movie will be received by its audience, as reflected in its ratings. There are two aspects to this question: (1) which parts of the movie or trailer are most important (e.g., type of scene; beginning, middle or end) and (2) which aspects of the content are important (e.g., what is depicted, how it is edited). Because trailers and movie clips are different, we expect that it will be most productive to take their differences into account in this task. For example, movie clips usually consist of a few long shots focusing on a particular scene, while trailers use many short shots summarizing the entire movie.

An important challenge of the task is addressing the fact that user ratings on movies are atomic (i.e., users assign them to the movie as a whole), and it is not clear to what extent we can assume that different parts of the movie or trailer contribute compositionally to the rating. This task explores the idea that it is productive to look for short segments that are predictive of the rating, and that it is not necessary to process the full-length movie for successful rating prediction. The advantages of a system that uses short segments are twofold: first, short segments allow for a dramatic reduction in computational time, and second, short segments are more readily available than full movies.

The overall goal of this task is to gain insight into how multimedia content can best be used to understand which parts of a movie are key to driving users’ opinions, so that these parts can be leveraged for successful movie recommendation. These insights can also help movie streaming companies and producers understand how movies can be summarized, or how movies can be made so that they leave viewers with a positive impression.

Target group
Researchers will find this task interesting if they work in the research areas of multimedia processing, personalization and recommender systems, machine learning, and information retrieval.

Data
Participants are supplied with audio, visual and text features computed from trailers and clips corresponding to about 800 unique movies from the well-known MovieLens 20M dataset. The task makes use of the user ratings and tags (keywords) from the MovieLens dataset. Links to the clips and trailers (mainly on YouTube) are also provided.

Ground truth and evaluation
The evaluation places the user at its center: how well trailers and clips represent a movie is measured by how accurately users’ global ratings can be predicted from them, using standard error metrics such as RMSE. We will also look for quantitative insights that answer the question "Which content is key?" in the task title.
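As an illustration of the error metric named above, here is a minimal sketch of root-mean-square error (RMSE) between predicted and ground-truth values. The function name `rmse` and the example numbers are hypothetical, and the official evaluation script may differ in detail.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predicted vs. ground-truth average ratings for three movies
predicted_avg = [3.5, 4.0, 2.0]
true_avg = [3.0, 4.0, 3.0]
print(f"RMSE = {rmse(predicted_avg, true_avg):.4f}")
```

The same function can be applied separately to the predicted average ratings and to the predicted rating variances, giving one score per target.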

Recommended reading
[1] Yashar Deldjoo, Mihai Gabriel Constantin, Markus Schedl, Bogdan Ionescu, Paolo Cremonesi. MMTF-14K: A Multifaceted Movie Trailer Feature Dataset for Recommendation and Retrieval, Proceedings of the 9th ACM Multimedia Systems Conference, 2018.

[2] Yashar Deldjoo, Mehdi Elahi, Paolo Cremonesi, Franca Garzotto, Pietro Piazzolla, Massimo Quadrana. Content-based Video Recommendation System based on Stylistic Visual Features, Journal on Data Semantics, 5(2), pp. 99-113, 2016.

[3] Yimin Hou, Ting Xiao, Shu Zhang, Xi Jiang, Xiang Li, Xintao Hu, Junwei Han, Lei Guo, L. Stephen Miller, Richard Neupert, Tianming Liu. Predicting Movie Trailer Viewer's “like/dislike” via Learned Shot Editing Patterns, IEEE Transactions on Affective Computing, 7(1), pp. 29-44, 2016.

[4] Robert Marich. Marketing to Moviegoers: A Handbook of Strategies and Tactics, SIU Press, 2013.

Task organizers
Yashar Deldjoo, Politecnico di Milano, Italy
Thanasis Dritsas, TU Delft, Netherlands
Mihai Gabriel Constantin, University Politehnica of Bucharest, Romania
Bogdan Ionescu, University Politehnica of Bucharest, Romania
Markus Schedl, Johannes Kepler University Linz, Austria

Task schedule
Development data release: 20 July 2018 (updated deadline)
Test data release: Shortly after development data release
Runs due: 25 September 2018
Working Notes paper due: 17 October 2018