Datasets with reference analysis
Algorithms in computational music analysis need to be evaluated. This task is not easy: a single piece admits many different analyses that are musically relevant. However, some analyses are definitely more correct than others, and reference annotations can be used to evaluate some parts of analysis algorithms.
As computer scientists, we would like to have computer-readable reference datasets that can serve as a ground truth to evaluate music information retrieval (MIR) and computational music analysis (CMA) algorithms. But as music theorists, we know that a given piece does not have just one correct analysis: listeners, players, and analysts often disagree, or at least propose several points of view.
Still, many music theorists, players, and listeners agree on some analytical elements. The fact that consensus may be difficult to reach on some points should not prevent us from trying to formalize some elements.
Dataset for fugue analysis
Mathieu Giraud, Richard Groult, Florence Levé
This dataset gives a reference analysis for the 24 fugues of the first book of Bach's Well-Tempered Clavier (WTC I, BWV 846-869) and the first 12 fugues from Shostakovich's 24 Preludes and Fugues (op. 87, 1952). These annotations are based on several musicological sources as well as on our own analysis. The file gives the symbolic position (measure number and position in measure) of subjects (S) and counter-subjects (CS), as well as cadences and pedals. We also report slight modifications of S/CS (actual start with respect to the time signature, delayed resolutions, and so on).
As in any analytical work, musicologists may not agree on some analytic elements. This is true even for fundamental elements such as the exact definition of the subject: in 8 of the 24 Bach fugues, at least two sources disagree on where the subject ends. We indicate these alternative subject definitions in the file (but do not report alternative CS).
We collected these data primarily to evaluate our own fugue analysis algorithms, but they might also be useful in other situations, for instance to evaluate algorithms for pattern extraction or structure analysis.
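As an illustration of how such reference annotations might be used for evaluation, here is a minimal Python sketch that compares predicted occurrences against reference ones and computes precision, recall, and F1. It does not read the fugues.truth file: the `Occurrence` class, the `evaluate` function, the `tolerance` parameter, and the sample data are all hypothetical and are not taken from the dataset. Only the idea of a symbolic position (measure number and position in measure) comes from the description above.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Occurrence:
    """One pattern occurrence (hypothetical in-memory form, not the file syntax)."""
    label: str       # "S" for subject, "CS" for counter-subject
    measure: int     # measure number where the occurrence starts
    beat: Fraction   # position inside the measure, in beats


def evaluate(reference, predicted, tolerance=Fraction(0)):
    """Return (precision, recall, f1) of `predicted` against `reference`.

    A prediction matches a reference occurrence when label and measure are
    equal and the beat positions differ by at most `tolerance`. Each
    reference occurrence can be matched at most once.
    """
    unmatched = list(reference)
    true_positives = 0
    for p in predicted:
        for r in unmatched:
            same_place = (p.label, p.measure) == (r.label, r.measure)
            if same_place and abs(p.beat - r.beat) <= tolerance:
                unmatched.remove(r)
                true_positives += 1
                break
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up toy data, for illustration only (not from the dataset).
reference = [
    Occurrence("S", 1, Fraction(1)),
    Occurrence("S", 3, Fraction(1)),
    Occurrence("CS", 3, Fraction(1)),
    Occurrence("S", 7, Fraction(3)),
]
predicted = [
    Occurrence("S", 1, Fraction(1)),
    Occurrence("S", 3, Fraction(3, 2)),  # half a beat late
    Occurrence("S", 5, Fraction(1)),     # spurious detection
]

print(evaluate(reference, predicted))
print(evaluate(reference, predicted, tolerance=Fraction(1, 2)))
```

The `tolerance` argument is one possible way to accept slightly shifted starts, such as the modified S/CS starts mentioned above. A real evaluation would probably also have to handle the alternative subject definitions and distinguish voices, which this sketch ignores.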
- Dataset: fugues.truth (release 2013.12)
- Description of the syntax of the file
- 2013.12: First release on 12 Shostakovich fugues + minor updates on Bach fugues (960 annotations)
- 2013.05: First release on 24 Bach fugues (610 annotations)
The annotations include all complete subjects and counter-subjects, as well as pedals, and, for Bach, cadences. Further releases will also include incomplete occurrences of S/CS. We welcome any feedback or suggestions.
References
If you use this dataset, please cite the following reference:
- Mathieu Giraud, Richard Groult, Emmanuel Leguy, Florence Levé, Computational Fugue Analysis, Computer Music Journal, 39(2), 2015
- Siglind Bruhn, J. S. Bach's Wohltemperiertes Klavier, Analyse und Gestaltung, Edition Gorz, 2006, ISBN 3-938095-05-9
- Siglind Bruhn, J. S. Bach's Well-Tempered Clavier. In-depth Analysis and Interpretation, 1993, ISBN 962-580-017-4, 962-580-018-2, 962-580-019-0, 962-580-020-4.
- Claude Charlier, Pour une lecture alternative du Clavier bien tempéré, 2009, Éditions Jacquart
- Hermann Keller, Das Wohltemperierte Klavier von Johann Sebastian Bach, 1965, Bärenreiter
- Ebenezer Prout, Analysis of J.S. Bach's forty-eight fugues (Das Wohltemperirte Clavier). E. Ashdown, London, 1910.
- Donald F. Tovey, Forty-Eight Preludes and Fugues, J.-S. Bach, 1924, Associated Board of the Royal Schools of Music
- Denis V. Plutalov, Dmitry Shostakovich's Twenty-Four Preludes and Fugues op. 87, Ph.D. thesis, Univ. Nebraska, 2010
- 48 Jewels and 24 Jewels, www.earsense.org