On Music Structure Analysis: Machine learning implementations of the Segmentation by Annotation approach
Summary
This thesis proposes a novel approach to Music Structure Analysis (MSA) that implements Segmentation by Annotation (SbA) with two machine learning architectures: a convolutional neural network (CNN) and a neural network built from Long Short-Term Memory (LSTM) units. It gives an overview of current advances in music structure analysis and of how the proposed architectures are used in related research fields. The evaluation methods are described, and under them the proposed architectures show promising results on a custom ground truth. This ground truth is a modified version of the human-annotated segments in the Internet Archive subset of the SALAMI dataset, in which the number of unique high-level segment functions is reduced from 26 to 9. Comparing SbA with the (more symbolic) Distance-based Segmentation and Annotation approach allows machine learning and non-machine-learning techniques to be contrasted. Finally, future research is proposed to improve the Segmentation by Annotation approach as well as music structure analysis in general.