C@merata: The 2014 Question Answering on Classical Music Scores Task (New!)
Musicological analyses of works of Western classical art music make frequent references to relevant passages in the printed score. Indeed, musicologists can refer to very complex aspects of a score within a text. Other experts know which passages are meant because they can interpret the musical terminology appropriately and then look through the score to find them. However, this can be time-consuming, so the long-term aim of this task is to develop tools that facilitate the work of musicologists.
In the C@merata task, there will be a series of questions with required answers. Each question will consist of a short noun phrase in English referring to musical features in a score and a short classical music score in MusicXML. The required answer will consist of zero or more passages occurring in the score which contain the musical features specified in the question. A passage consists of a start point and an end point in the score associated with the question.
For full details please refer to http://csee.essex.ac.uk/camerata.
To participate, you will need to become familiar with natural language processing techniques and with methods of processing musical score data. You will also need a good technical knowledge of Western classical music theory.
There will be 200 questions along the lines of the following example:
Work: J. Dowland, King of Denmark's Galliard, P 40
Q: perfect cadence
Q: four consecutive quavers / four consecutive eighth notes
A: [3/4,2,2:3-2:6]
Q: dotted minim in the bass / dotted half note in the bass
Q: harmonic fourth
A: [3/4,1,1:1-1:1], [3/4,1,1:2-1:3], [3/4,1,4:1-4:3]
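As an illustration of how a system might read answer strings like those above, here is a minimal Python sketch. It assumes that each passage has the form [time signature, divisions, start bar:start beat-end bar:end beat]; this reading of the fields is an assumption, and the task website has the authoritative definition. The names Passage and parse_passages are hypothetical and are not part of any task software.

```python
import re
from typing import List, NamedTuple


class Passage(NamedTuple):
    """One answer passage. Field meanings are assumed, not taken from the task spec."""
    time_signature: str  # e.g. "3/4"
    divisions: int       # assumed: beat subdivisions used for counting
    start_bar: int
    start_beat: int
    end_bar: int
    end_beat: int


# Matches e.g. "[3/4,1,1:2-1:3]", tolerating optional spaces around separators.
_PASSAGE_RE = re.compile(
    r"\[\s*(\d+/\d+)\s*,\s*(\d+)\s*,\s*(\d+):(\d+)\s*-\s*(\d+):(\d+)\s*\]"
)


def parse_passages(answer: str) -> List[Passage]:
    """Return every passage found in an answer string (possibly none)."""
    passages = []
    for match in _PASSAGE_RE.finditer(answer):
        time_sig, div, s_bar, s_beat, e_bar, e_beat = match.groups()
        passages.append(
            Passage(time_sig, int(div), int(s_bar), int(s_beat), int(e_bar), int(e_beat))
        )
    return passages


if __name__ == "__main__":
    for p in parse_passages("[3/4,1,1:1-1:1], [3/4,1,1:2-1:3], [3/4,1,4:1-4:3]"):
        print(p)
```

An answer with no passages yields an empty list, matching the "zero or more passages" wording of the task.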
The repertoire will consist of works from the Renaissance and Baroque periods by composers such as Dowland, Bach, Handel and Scarlatti. The music will vary in complexity. All works will be presented in standard Western classical staff notation.
There are further examples plus a detailed description of the task at the C@merata website: http://csee.essex.ac.uk/camerata.
Ground truth and evaluation
Evaluation of results will be carried out in various ways and at various levels.
A passage is considered bar-correct if it starts in the bar where the requested feature starts and ends in the bar where the requested feature ends. In many practical cases where a person is searching for a feature for research purposes, this is more than sufficient, because an expert can pick out the feature in an instant once the correct bar or range of bars has been identified.
We will also consider a more stringent measure in which a passage is considered beat-correct if it starts at exactly the correct beat in the start bar and ends at exactly the correct beat in the end bar. Beat-correct answers could be useful when the results are passed on to applications that are themselves automatic.
We provisionally define Strict Precision as the number of beat-correct passages returned by a system divided by the number of passages (correct or incorrect) returned.
Similarly, Strict Recall is the number of beat-correct passages returned by a system divided by the total number of answer passages known to exist.
Lenient measures can use bar-correct instead of beat-correct.
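The provisional definitions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the official scorer: passages are reduced to (start bar, start beat, end bar, end beat) tuples on a common beat grid, each gold passage may be matched by at most one returned passage, and an empty denominator gives 0.0. The names Span and precision_recall are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# (start_bar, start_beat, end_bar, end_beat); assumes all beats use the same divisions.
Span = Tuple[int, int, int, int]


def beat_correct(returned: Span, gold: Span) -> bool:
    """Strict match: same start bar and beat, same end bar and beat."""
    return returned == gold


def bar_correct(returned: Span, gold: Span) -> bool:
    """Lenient match: same start bar and same end bar, beats ignored."""
    return returned[0] == gold[0] and returned[2] == gold[2]


def precision_recall(
    returned: Sequence[Span], gold: Sequence[Span], strict: bool = True
) -> Tuple[float, float]:
    """Compute (precision, recall) using beat-correct or bar-correct matching."""
    match: Callable[[Span, Span], bool] = beat_correct if strict else bar_correct
    used: List[bool] = [False] * len(gold)
    correct = 0
    for r in returned:
        # Assumption: a gold passage can be credited only once.
        for i, g in enumerate(gold):
            if not used[i] and match(r, g):
                used[i] = True
                correct += 1
                break
    # Assumption: an empty denominator yields 0.0 rather than an error.
    precision = correct / len(returned) if returned else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical gold and system passages, for illustration only.
    gold = [(1, 1, 1, 1), (1, 2, 1, 3), (4, 1, 4, 3)]
    system = [(1, 1, 1, 1), (4, 1, 4, 2), (5, 1, 5, 1)]
    print("strict :", precision_recall(system, gold, strict=True))
    print("lenient:", precision_recall(system, gold, strict=False))
```

In this hypothetical run, the second system passage lies in the right bar but ends on the wrong beat, so it counts only under the lenient measure.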
We will prepare gold standard data using multiple annotators so that inter-annotator agreement can be checked.
Task organizers
Richard Sutcliffe, University of Essex, UK
Tim Crawford, Goldsmiths College, University of London
Eduard Hovy, Carnegie Mellon University
Deane L. Root, University of Pittsburgh
Chris Fox, University of Essex
Task schedule
2 May: Training data release
16-20 June: Download of test questions by participants and upload of results
27 June: Publication of results to participants
28 September: Working notes paper deadline
Note that this task is a "Brave New Task" and 2014 is the first year that it is running in MediaEval. If you sign up for this task, you will be asked to keep in particularly close touch with the task organizers concerning the task goals and the task timeline. Note also that the timeline of this task runs to a different rhythm than that of other MediaEval tasks.