The Effect of Deep Learning-based Source Separation on Predominant Instrument Classification in Polyphonic Music
Summary
Musical instrument classification is the task of detecting the individual instruments present in a music track.
It remains a challenging task, particularly in polyphonic music. Prior research has shown that
analysis-based music source separation can improve the performance of instrument classification.
Music source separation is the task of extracting isolated instrument groups, called stems, from music
tracks. We propose a novel deep learning-based source separation model in the time-frequency
domain that learns to generate a combination of the 'vocals' and 'other' stems. Additionally, we
develop a postprocessing algorithm that improves the subjective quality of these stems. We
also compare the objective performance of the raw and postprocessed stems and measure
which of them positively impacts instrument classification. We find that only the postprocessed
stems improve the performance of instrument classification. In addition, we perform an
instrument-wise analysis to examine which classified instruments are most affected by music source
separation. We find that cello, clarinet, piano, and violin were the only instruments that
were positively impacted. This research confirms the importance of the quality of the
source-separated stems. The instrument-wise analysis gives insight into which instruments benefit
most from source separation and which improvements in source separation quality are needed to
increase the performance of instrument classification.