Media Memorability

The 2019 Predicting Media Memorability Task

Task description
This task focuses on the problem of predicting how memorable a video is to viewers. It requires participants to automatically predict memorability scores for videos that reflect the probability a video will be remembered. Task participants are provided with an extensive dataset of videos that are accompanied by memorability annotations, as well as pre-extracted state-of-the-art visual features. The ground truth has been collected through recognition tests, and thus results from objective measurement of memory performance. Participants will be required to train computational models capable of inferring video memorability from visual content. Optionally, descriptive titles attached to the videos may be used. Models will be evaluated through standard evaluation metrics used in ranking tasks (Spearman’s rank correlation). The data set used in 2019, is the same as in 2018 (2018’s testset ground truth data has not been released). This year the task focuses on understanding the patterns in the data and improving the ability of algorithms to capture those patterns.

Task motivation and background
The motivation for the task is the growth over recent years in the amount of multimedia content shared and consumed via online platforms. Enhancing the usefulness of multimedia requires new ways to organize–in particular, to recommend and retrieve–digital content. Like other aspects related to video relevance, such as aesthetics or interestingness, memorability can be regarded as useful to help make a choice between competing videos. Effective memorability prediction models will also push forward the semantic understanding of multimedia content by putting human cognition and perception at the center of the multimedia analysis.

A number of different application areas benefit from deeper understanding of what makes some videos memorable, and others less so. These areas include recommendation and retrieval in online platforms, already mentioned, but also advertising, filmmaking, and education. It is increasingly important to understand what makes videos memorable in order to keep the use of automatic processing techniques evenhanded. Applications can make use of information about predicted memorability, but it is important to understand memorability well enough to be able to avoid systems that are hyper-optimized to viewer responses.

Target group
Researchers will find this task interesting if they work in the areas of human perception and the impact of multimedia on perception such as image and video interestingness, memorability, attractiveness, aesthetics prediction, event detection, multimedia affect and perceptual analysis, multimedia content analysis, machine learning (though not limited to).

Data
The dataset is composed of 10,000 (soundless) short videos extracted from raw footage used by professionals when creating content, and in particular, commercials. Each video consists of a coherent unit in terms of meaning and is associated with two scores of memorability that refer to its probability to be remembered after two different durations of memory retention.

The videos are shared under Creative Commons licenses that allow their redistribution. They come with a set of pre-extracted features, such as: Dense SIFT, HoG descriptors, LBP, GIST, Color Histogram, MFCC, Fc7 layer from AlexNet, C3D features, etc.

Ground truth and evaluation
Each video consists of a coherent unit in terms of meaning and is associated with two scores of memorability that refer to its probability to be remembered after two different durations of memory retention. Memorability has been measured using recognition tests, i.e., through an objective measure, a few minutes after the memorization of the videos, and then 24 to 72 hours later.

The outputs of the prediction models – i.e., the predicted memorability scores for the videos – will be compared with ground truth memorability scores using classic evaluation metrics (e.g., Spearman’s rank correlation).

Recommended reading
[1] Romain Cohendet, Claire-Hélène Demarty, Ngoc Q. K. Duong, Mats Sjöberg, Bogdan Ionescu, Thanh-Toan Do. 2018. MediaEval 2018: Predicting Media Memorability. In Working Notes Proceedings of the MediaEval 2018 Workshop. Sophia Antipolis, France, 29-31 October 2018.
[2] Aditya Khosla, Akhil S Raju, Antonio Torralba, and Aude Oliva. 2015. Understanding and predicting image memorability at a large scale. In Proc. IEEE Int. Conf. on Computer Vision (ICCV). 2390–2398.
[3] Phillip Isola, Jianxiong Xiao, Devi Parikh, Antonio Torralba, and Aude Oliva. 2014. What makes a photograph memorable? IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 7 (2014), 1469–1482.
[4] Hammad Squalli-Houssaini, Ngoc Duong, Marquant Gwenaëlle, and Claire-Hélène Demarty. 2018. Deep learning for predicting image memorability. In Proc. IEEE Int. Conf. on Audio, Speech and Language Processing (ICASSP).
[5] Junwei Han, Changyuan Chen, Ling Shao, Xintao Hu, Jungong Han, and Tianming Liu. 2015. Learning computational models of video memorability from fMRI brain imaging. IEEE transactions on cybernetics 45, 8 (2015), 1692–1703.
[6] Sumit Shekhar, Dhruv Singal, Harvineet Singh, Manav Kedia, and Akhil Shetty. 2017. Show and Recall: Learning What Makes Videos Memorable. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2730–2739.
[7] R. Cohendet, K. Yadati, N. Duong, C-H. Demarty. 2018. Annotating, understanding, and predicting long-term video memorability. In proc. ACM Int. Conf. on Multimedia Retrieval (ICMR).
[8] Romain Cohendet, Claire-Hélène Demarty, Ngoc Q. K. Duong, M. Engilberge. VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability. 2018. Arxiv:1812.01973

Please note that a dataset for predicting long-term video memorability is publicly available with [7] at the following address:
https://www.technicolor.com/dream/research-innovation/movie-memorability-dataset

Task organizers
Mihai Gabriel Constantin, University Politehnica of Bucharest, Romania
Bogdan Ionescu, University Politehnica of Bucharest, Romania
Claire-Hélène Demarty, Technicolor, France
Quang-Khanh-Ngoc Duong, Technicolor, France
Xavier Alameda-Pineda, INRIA, France
Mats Sjöberg, CSC, Finland

Task auxiliaries
Liviu-Daniel Ştefan, University Politehnica of Bucharest, Romania
Ricardo Savii, Federal University of São Paulo, Brazil

Task schedule
Development data release: 1 May 2019
Test data release: 03 June 2019
Runs due: 20 September 2019
Working Notes paper due: 30 September 2019
MediaEval 2019 Workshop (in France, near Nice): 27-29 October 2019

MediaEval Benchmarking Initiative for Multimedia Evaluation

The "multi" in multimedia: speech, audio, visual content, tags, users, context