Digital Musicology
How can music be analyzed and how can musical structures be explained? These are questions that arise time and again in music research.
Helen Gebhart - The Digital and Cognitive Musicology Lab at the École Polytechnique Fédérale de Lausanne (EPFL) uses mathematical models to explain musical phenomena. How can mathematics and music research be combined? Daniel Harasim, a postdoctoral researcher in Lausanne, studied mathematics and combined it with his personal love of jazz as early as his dissertation. In this work, entitled The Learnability of the Grammar of Jazz: Bayesian Inference of Hierarchical Structures in Harmony, he created a computer simulation that illustrates the process of learning jazz music. The key question for his research is: How can musicological hypotheses, music-analytical theories and music-cognitive phenomena be explored using mathematical models? In the following, he talks about the opportunities and challenges of digital musicology.
Daniel Harasim, what is Digital Musicology?
As digital musicology is very new, it is difficult to define. However, four sub-areas can be clearly enumerated.
A first area includes the digitization of originals in archives, but also projects such as the Music Encoding Initiative with the Music Encoding Conference. A second, more practical area deals with questions of artificial intelligence and its application in automated analysis, composition and the interaction between music machines and humans.
A third field of digital musicology is intercultural research. The hope is that working with data will allow more objective comparisons to be made. However, data-based research can also be problematic: the mere use of data does not automatically make research objective. It depends on how the data is interpreted and, in particular, on which data is considered at all. Nevertheless, digital data offers great potential. In order to exploit this potential, new models must be developed so that ethnomusicological data can be handled in a contemporary way.
The fourth major area is the digitization of musical analyses. For example, if you have digitized all of Beethoven's string quartets and want to analyze their harmony and modulation structure, you have to describe the analysis in a digital form that a computer can easily read. This is then the interface to computer modeling and also to deeper questions about how music works. The existence of digital data is therefore a prerequisite for the creation of computer models.
There are various motivations for computer modeling. In the Digital and Cognitive Musicology Lab, which is headed by Martin Rohrmeier, computer modeling is concerned with the representation of cognitive processes, and it is carried out in close contact with psychology. We derive hypotheses from music theory and try to express them in a clear mathematical model, which we then test in psychological experiments.
How do these mathematical models work?
A whole class of models that I like to use are so-called generative models. These models generate something. You can imagine that a composer has "composed" data. This data can now be analyzed by recreating the generation process in the computer model. The analysis takes place through this attempt at recreation. Recreating a piece of music in this way is difficult, but if it works, you can gain insights into the principles underlying the music through this generative process.
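The idea of analysis by recreation can be illustrated with a small sketch. This is not the lab's model, only a toy example under stated assumptions: two hypothetical generative models each assign probabilities to the twelve pitch classes, and a sequence of notes is "explained" by whichever model would have been more likely to generate it. The function names, the weight of 0.9 and the choice of C major and C natural minor are all assumptions made for illustration.

```python
import math
import random

def make_model(scale, in_scale_weight=0.9):
    """Hypothetical toy model: a probability for each of the 12 pitch classes.

    Most of the probability mass is spread evenly over the scale tones,
    the remainder evenly over the tones outside the scale.
    """
    n_in, n_out = len(scale), 12 - len(scale)
    return [
        in_scale_weight / n_in if pc in scale else (1.0 - in_scale_weight) / n_out
        for pc in range(12)
    ]

# Pitch classes: 0 = C, 1 = C sharp, ..., 11 = B.
C_MAJOR = make_model({0, 2, 4, 5, 7, 9, 11})
C_MINOR = make_model({0, 2, 3, 5, 7, 8, 10})  # natural minor

def generate(model, n, rng):
    """Generation step: draw n pitch classes from the model."""
    return rng.choices(range(12), weights=model, k=n)

def log_likelihood(model, notes):
    """How probable is it that this model generated these notes?"""
    return sum(math.log(model[pc]) for pc in notes)

rng = random.Random(0)
notes = generate(C_MAJOR, 200, rng)  # stands in for "composed" data

scores = {
    "C major": log_likelihood(C_MAJOR, notes),
    "C minor": log_likelihood(C_MINOR, notes),
}
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy setting the analysis consists of comparing how well each candidate generation process accounts for the notes. Real generative models in music research are considerably richer, but the principle of scoring data against a generation process is the same.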
In our study Exploring the Foundations of Tonality: Statistical Cognitive Modeling of Modes in the History of Western Classical Music (2021), we used a generative algorithm and tried to generate the tonalities of different centuries. The data for this consists of over 13,000 pieces of music in the form of MIDI files, in which only the occurring tones (or pitch classes) were counted. All other aspects of the music were excluded, even the temporal sequence of the tones. This reduced view of the music was sufficient for the computer model to arrive at a concept of major and minor. The computer didn't know anything about major and minor beforehand, but figured it out based on the data. Traditional musicology takes a more holistic approach; here, only a single aspect is considered with a simple model, but 1000 composers are compared. In this way, musicological questions can be examined quantitatively. However, you have to be very careful not to make any false assumptions. It is essential to know the musicological approach and the music-historical context in order to see whether a model can be interpreted meaningfully at all.
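The counting step described here can be sketched in a few lines. This is a minimal illustration, assuming each piece is already available as a list of MIDI note numbers (reading MIDI files is left out); the helper name pitch_class_counts is hypothetical, and the study's statistical model itself is not reproduced.

```python
from collections import Counter

def pitch_class_counts(midi_notes):
    """Count how often each of the 12 pitch classes occurs.

    Octave and temporal order are discarded: MIDI note 60 (middle C)
    and note 72 (the C an octave above) both count as pitch class 0.
    """
    counts = Counter(note % 12 for note in midi_notes)
    return [counts.get(pc, 0) for pc in range(12)]

# Illustrative input: a C major scale from middle C up to the next C.
melody = [60, 62, 64, 65, 67, 69, 71, 72]
backwards = list(reversed(melody))

print(pitch_class_counts(melody))

# The order of the notes plays no role in this representation.
assert pitch_class_counts(melody) == pitch_class_counts(backwards)
```

Each piece is thereby reduced to twelve numbers, which is the kind of input a model can use to group pieces by their tone content.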
The data for this study comes from Classical Archives, a portal where anyone can enter pieces of music. Doesn't this user-generated data set result in a distorted picture of music history that is more reflective of today's musical canon?
This is a difficult question that does not yet have a definitive answer. It depends on what question you ask of the data set. In our study, we wanted to show how simple the method is: counting how often each tone occurs is enough for the computer model to learn what modes there are. The compilation of the data set is indeed a weak point, and there are also errors in this data set. But since it contains 13,000 pieces, these errors no longer carry much statistical weight. We also work with very controlled and high-quality data sets, such as the Annotated Beethoven Corpus (ABC), which contains harmonic analyses of all of Beethoven's string quartets. My colleagues recently published a corpus of Mozart's piano sonatas. These smaller data sets can also be combined with the larger ones to get a comprehensive picture of music.