        •   Utrecht University Student Theses Repository Home
        • UU Theses Repository
        • Theses
        Detecting Musical Rhetoric Figures with LSTM using Procedurally Generated Synthetic Data

        Master_Thesis_Niek_de_Gier.pdf (508.2Kb)
        Publication date
        2022
        Author
        Gier, Niek de
        Summary
        Musicologists have studied the rhetorical techniques applied in Baroque music, including the works of Johann Sebastian Bach. These musical rhetoric figures take the form of rhythmic and melodic patterns, each intended to evoke a particular emotion or Christian symbolism. This thesis presents a machine learning approach to pattern recognition in symbolic music. A Long Short-Term Memory (LSTM) model is trained to detect the figura corta, a rhythmic figure, in the cantatas of J.S. Bach. Since no labeled dataset of Bach's cantatas exists and labeling the data manually would be extremely time-consuming, this thesis approaches the problem by generating a dataset from scratch. By drawing up laws and rules by which the data should abide, an algorithm is presented that generates plausible musical fragments to serve as training input. Twelve different parameter settings were then explored by training an LSTM on each of the resulting datasets. The best-performing model was subsequently tested on six real cantata movements. On average, the LSTM model achieved a precision of 78.53%, a recall of 83.12% and an accuracy of 94.46%. From these results it is concluded that synthetic data can form a reliable dataset for training an LSTM that successfully classifies whether real data contains a figura corta. Furthermore, this result implies that the presented method of procedurally generating data produces a varied and correct dataset. By extension, it implies that the proposed laws and rules, as well as the representation of the musical data, allow the LSTM to apply its learned features correctly to real data.
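        For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how precision, recall and accuracy are conventionally computed from binary classification counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the thesis and do not reproduce its evaluation procedure.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts.

    tp: fragments correctly flagged as containing the figure
    fp: fragments flagged although the figure is absent
    fn: fragments containing the figure that were missed
    tn: fragments correctly left unflagged
    """
    total = tp + fp + fn + tn
    # Guard against division by zero when a class never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts for illustration only (NOT results from the thesis).
precision, recall, accuracy = classification_metrics(tp=8, fp=2, fn=2, tn=88)
print(f"precision={precision:.2%} recall={recall:.2%} accuracy={accuracy:.2%}")
```

        In this made-up example most fragments are negatives, so accuracy comes out higher than precision and recall. This is the usual behaviour of these definitions when one class dominates.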
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/42460
        Collections
        • Theses