Detecting Musical Rhetoric Figures with LSTM using Procedurally Generated Synthetic Data
Summary
Musicologists have studied the rhetorical techniques applied in Baroque music, including the works of Johann Sebastian Bach. These musical rhetoric figures take the
form of rhythmic and melodic patterns, each intended to evoke a particular emotion or to convey Christian symbolism. This thesis presents a machine learning approach to
pattern recognition in symbolic music. A Long Short-Term Memory (LSTM) model is
trained to detect the figura corta, a rhythmic figure, in the cantatas of J.S. Bach.
Since no labeled dataset of Bach’s cantatas exists and labeling the data manually
would be extremely time-consuming, this thesis approaches the problem by generating a
dataset from scratch. A set of laws and rules that the data must obey is drawn up, and an
algorithm is presented that generates plausible musical fragments as training input based
on these criteria. Then, twelve different parameter settings were explored by training an
LSTM on the resulting datasets. The best-performing model was subsequently tested
on six real cantata movements. On average, the LSTM model achieved a precision of
78.53%, a recall of 83.12% and an accuracy of 94.46%. These results indicate that a
synthetic dataset can be reliable enough to train an LSTM that successfully classifies
whether real data contains a figura corta. Furthermore, they suggest that the presented
method of procedurally generating data produces a varied and correct dataset and, by
extension, that the proposed laws and rules, together with the chosen representation of
the musical data, allow the LSTM to apply its learned features correctly to real data.
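
As a reading aid for the reported figures, the sketch below shows how precision, recall and accuracy are conventionally computed for a binary classifier (1 = the fragment contains a figura corta, 0 = it does not). The names `Confusion` and `confusion_from_labels` and the toy label lists are illustrative assumptions. They are not the evaluation code or the data used in this thesis.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # figure present, predicted present
    fp: int  # figure absent, predicted present
    fn: int  # figure present, predicted absent
    tn: int  # figure absent, predicted absent


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int]) -> Confusion:
    """Tally the four outcomes from equal-length lists of 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return Confusion(tp, fp, fn, tn)


def precision(c: Confusion) -> float:
    """Share of predicted figures that are real figures."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: Confusion) -> float:
    """Share of real figures that the model found."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def accuracy(c: Confusion) -> float:
    """Share of all fragments classified correctly, true negatives included."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only; not taken from the cantata movements.
    y_true = [1, 1, 0, 0, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
    c = confusion_from_labels(y_true, y_pred)
    print(f"precision={precision(c):.4f} recall={recall(c):.4f} accuracy={accuracy(c):.4f}")
```

Because accuracy also counts correctly rejected fragments (true negatives), it can exceed both precision and recall when most fragments do not contain the figure.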