Music Discovery through Visualisation of Audio Features: Introducing SoundShapes
Summary
Popular music streaming platforms typically display no audio-derived information other than track duration. This limits users' ability to guide their own exploration of music: they must either listen to each song the recommendation algorithm suggests or rely on their knowledge of the artist and other metadata. This work proposes SoundShapes: visual thumbnails generated from audio features to facilitate music discovery.
SoundShapes visualise the mood and timbre characteristics of the audio. Mood is approximated on the valence-arousal scale. Timbre is represented by the instruments used and by an abstract representation of the genre. These features were extracted using a combination of methods, including low-level signal processing and a convolutional neural network.
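To make the pipeline concrete, the sketch below shows one plausible way to compute such features in Python. It is an illustration only, not the implementation used in this work: librosa is assumed as the audio library, the specific descriptors (MFCCs, spectral centroid) stand in for unspecified low-level features, and the mel spectrogram represents a typical input for a mood or genre CNN.

```python
# A minimal sketch of an audio feature-extraction pipeline of the kind
# described above. The library choice (librosa) and all descriptors here
# are assumptions for illustration, not the authors' implementation.
import librosa
import numpy as np

def extract_features(path: str):
    # Load the track as mono at librosa's default 22.05 kHz sample rate.
    y, sr = librosa.load(path, mono=True)

    # Low-level signal processing: timbre descriptors such as MFCCs and
    # the spectral centroid, averaged over time into summary values.
    mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13), axis=1)
    centroid = float(np.mean(librosa.feature.spectral_centroid(y=y, sr=sr)))

    # A log-scaled mel spectrogram is a common input representation for a
    # convolutional neural network estimating mood (valence-arousal) or genre.
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    mel_db = librosa.power_to_db(mel, ref=np.max)

    return mfcc, centroid, mel_db
```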
A prototype user interface was built and used to evaluate the SoundShapes. Although the sample was small and not representative of a wider user population, the evaluation results suggest potential for broader user acceptance.