Performance Analysis and Optimization of Music Segmentation Algorithms with a Focus on Jazz
Summary
Music segmentation is a core task in Music Information Retrieval (MIR). Despite extensive research in this area, almost all existing segmentation algorithms were designed for, and evaluated on, pop music. In this study, existing segmentation algorithms were evaluated comparatively on the Jazz Structure Dataset (JSD) and the Beatles dataset. The results show that traditional algorithms perform poorly on jazz. To address the challenges posed by jazz, two optimization methods are proposed: (1) feature engineering, which combines dual-feature fusion with time-dependent weighting, and (2) a regularity constraint applied during post-processing. Feature engineering did not improve performance, whereas the regularity constraint significantly improved segmentation precision. This study also explores the use of a Temporal Convolutional Network (TCN) for jazz segmentation. Although the TCN did not outperform the traditional algorithms, the potential of deep learning methods remains to be explored once sufficient training data becomes available.
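For readers unfamiliar with how segmentation precision is typically measured, boundary detection in MIR is commonly scored with a hit rate: an estimated boundary counts as correct if it lies within a tolerance window of an unmatched reference boundary. The sketch below illustrates that general idea only. The summary does not state which metric or tolerance this study uses, so the function name `boundary_hit_rate`, the 0.5 s default tolerance, and the example timestamps are all assumptions made for illustration.

```python
def boundary_hit_rate(reference, estimated, tolerance=0.5):
    """Return (precision, recall, f_measure) for boundary times in seconds.

    Illustrative sketch, not the study's evaluation code. Each reference
    boundary can be matched by at most one estimated boundary.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = hits = 0
    # Two-pointer sweep over both sorted lists; with a fixed symmetric
    # window this greedy pairing yields a maximum one-to-one matching.
    while i < len(ref) and j < len(est):
        if abs(est[j] - ref[i]) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimate is too early to match any remaining reference
        else:
            i += 1  # reference is too early to match any remaining estimate
    precision = hits / len(est) if est else 0.0
    recall = hits / len(ref) if ref else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


# Hypothetical boundary times (seconds), chosen only to exercise the function.
reference_boundaries = [10.0, 20.0, 30.0]
estimated_boundaries = [10.2, 19.0, 25.0, 30.4]

strict = boundary_hit_rate(reference_boundaries, estimated_boundaries, tolerance=0.5)
loose = boundary_hit_rate(reference_boundaries, estimated_boundaries, tolerance=3.0)
```

Under this kind of metric, spurious boundaries lower precision without affecting recall, which is one plausible reason a post-processing step that removes irregular boundaries could raise precision in particular.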