Melodic Similarity among Incipits: A Deep Learning Approach
Summary
This project introduces transformer models to the field of melodic similarity.
Originating in the field of Natural Language Processing, transformer
models use sequential data to find relationships between individual
parts of a sequence. Melodies can be regarded as sequences of
notes, making melodies viable as input for transformer models. Deep
learning approaches like RNNs have already proven to compete with the
state-of-the-art alignment algorithm regarding melodic similarity. This
project aims to discover whether a transformer model is capable of finding
groups of similar melodies in the Meertens Tune Collection (MTC).
Using triplet loss, transformers can learn an embedding space in which
similar melodies cluster together. Here we show that transformers
reach a MAP score of 0.36 and a P@1 score of 0.48 on unseen
data from the MTC dataset. We also compared different input
forms, finding that whole melodies, random triplets, and a
complex feature set are preferable to the other input variants. We
discovered that the tokenisation process placed restrictions on feature
selection, which negatively impacted model performance. Although
they do not yet compete with state-of-the-art methods, transformers
show potential for solving problems related to melodic similarity.
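The triplet-loss objective mentioned above can be sketched as follows. This is a minimal illustration on toy melody embeddings, not the project's actual implementation; the embedding values, dimensionality, and margin are hypothetical, chosen only to show how the loss pulls similar melodies together and pushes dissimilar ones apart.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: the anchor should be closer to the positive
    (a similar melody) than to the negative (a dissimilar melody)
    by at least `margin`; otherwise a positive loss is incurred."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

# Hypothetical 2-D embeddings of three melodies (illustrative values).
anchor   = [0.0, 0.0]
positive = [0.1, 0.0]   # embedding of a similar melody
negative = [2.0, 0.0]   # embedding of a dissimilar melody

# Well-separated triplet: d_pos (0.1) + margin (1.0) < d_neg (2.0),
# so the loss is zero and no gradient signal is needed.
print(triplet_loss(anchor, positive, negative))  # → 0.0

# Violating triplet: swapping positive and negative yields a positive loss.
print(triplet_loss(anchor, negative, positive))  # → 2.9
```

Minimising this loss over many (anchor, positive, negative) triplets shapes the embedding space so that similar melodies cluster, which is what enables retrieval metrics such as MAP and P@1 to be computed over nearest neighbours.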