NewsREEL Multimedia: News recommendation with image/text content

Task description
Participants will explore ways to use images and text for better news recommendation. We have collected data from several news websites for four weeks. The goal of the task is to predict the number of times users access each news item on each day.

The data include news articles, images, and interactions with visitors. For the images, we will release the URL and the activations of a hidden layer of a neural network trained on ImageNet. Interactions include accessing articles as well as clicking on recommendations. The data are subject to a usage agreement with the data provider plista GmbH.

Participants will get all data for the first three weeks. The test data consist of the fourth week. For this week, participants receive the content and must predict the interactions: for each combination of item and day, participants are required to predict the number of times that users access the item on that day.
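To make the expected output concrete, the following is a minimal sketch of a content-agnostic baseline. It is not part of the task definition. It assumes the training interactions have been reduced to a list of (item_id, day) access events, which is a hypothetical layout and not the format of the released data, and the names daily_access_counts and mean_daily_baseline are illustrative. The sketch predicts, for every item and test day, the item's mean daily access count over the training days on which it was accessed, and falls back to the global mean for items that do not appear in the training weeks.

```python
from collections import Counter, defaultdict


def daily_access_counts(access_log):
    """Count accesses per (item_id, day) pair.

    access_log is an iterable of (item_id, day) tuples, one per access.
    This layout is an assumption made for illustration only.
    """
    return Counter(access_log)


def mean_daily_baseline(train_counts, test_items, test_days):
    """Predict one access count for every (item, day) pair in the test week."""
    # Collect the observed daily counts of each item. Days on which an item
    # was never accessed do not appear in train_counts, so they are ignored.
    per_item = defaultdict(list)
    for (item_id, _day), n in train_counts.items():
        per_item[item_id].append(n)

    # Fallback for items that were never seen during training.
    all_counts = list(train_counts.values())
    global_mean = sum(all_counts) / len(all_counts) if all_counts else 0.0

    predictions = {}
    for item_id in test_items:
        counts = per_item.get(item_id)
        estimate = sum(counts) / len(counts) if counts else global_mean
        for day in test_days:
            predictions[(item_id, day)] = estimate
    return predictions


# Toy example with made-up identifiers.
log = [("a", "d1"), ("a", "d1"), ("a", "d2"), ("b", "d1")]
train_counts = daily_access_counts(log)
predictions = mean_daily_baseline(train_counts, ["a", "c"], ["d22"])
```

A baseline of this kind ignores images and text entirely, so it can serve only as a reference point for content-based approaches.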

Target group
We target researchers from academia and industry, especially those interested in machine learning, natural language processing, and/or feature extraction from images.

Data
The data cover a period of six weeks for a selection of five publishers. There are 51397 images related to articles during this period. These are distributed unevenly, with one publisher accounting for 42003 images (about 82% of the total). In addition, we provide a total of 1691 unique labels assigned by seven automatic annotators trained on ImageNet. The data set is approximately 8.6 GB in size. We observe a total of about 142 million impressions, 206 million recommendations, and 790 thousand clicks.

Ground truth and evaluation
The task is evaluated according to the mean squared error, where the error for each item and day is the difference between the predicted number of views and the recorded number of views.
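The measure can be sketched as follows. This is an illustration, not the official evaluation script: it assumes that predictions and recorded views are both stored as dictionaries keyed by (item_id, day), and that the squared errors are averaged over all item-day pairs in the ground truth. The exact aggregation is not specified here, and the function name and key layout are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (item_id, day); hypothetical identifiers


def mean_squared_error(predicted: Dict[Key, float],
                       recorded: Dict[Key, float]) -> float:
    """Average of (predicted - recorded)^2 over all recorded item-day pairs."""
    if not recorded:
        raise ValueError("no recorded views to evaluate against")
    missing = set(recorded) - set(predicted)
    if missing:
        raise ValueError(f"missing predictions for {len(missing)} item-day pairs")
    total = sum((predicted[key] - recorded[key]) ** 2 for key in recorded)
    return total / len(recorded)


# Toy example with made-up numbers.
recorded = {("item_a", "day_1"): 10, ("item_a", "day_2"): 4, ("item_b", "day_1"): 0}
predicted = {("item_a", "day_1"): 8, ("item_a", "day_2"): 4, ("item_b", "day_1"): 3}
mse = mean_squared_error(predicted, recorded)  # (4 + 0 + 9) / 3
```

Because the errors are squared, a few badly mispredicted popular items can dominate the score.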

Recommended reading
Lommatzsch, Andreas, et al. “CLEF 2017 NewsREEL overview: A stream-based recommender task for evaluation and education.” International Conference of the Cross-Language Evaluation Forum for European Languages. Springer, Cham, 2017.

Corsini, Francesco, and M. A. Larson. “CLEF NewsREEL 2016: Image-based recommendation.” (2016).
http://ceur-ws.org/Vol-1609/16090618.pdf

Pazzani, Michael J., and Daniel Billsus. “Content-based recommendation systems.” The adaptive web. Springer, Berlin, Heidelberg, 2007. 325-341.

Task organizers
Andreas Lommatzsch, TU Berlin, Germany, andreas.lommatzsch at dai-labor.de
Benjamin Kille, TU Berlin, Germany, benjamin.kille at dai-labor.de
Frank Hopfgartner, University of Sheffield, UK

Task schedule
More information coming soon.