A Deep Neural Network Approach to Automatic Birdsong Recognition
This thesis presents research into feature learning techniques for bird sound classification. To this end, deep neural networks are created and evaluated using recordings from the LifeCLEF 2014 Bird task (BirdCLEF) dataset, which contains recordings of 501 species from South America, centred on Brazil. A segmentation algorithm is developed that segments the audio files of the Bird task dataset, so that features are extracted only from the birdsong parts of each recording. Four datasets are created, containing MFCC and DFT features. These datasets are shuffled and split into training and test sets, which are used to train deep neural networks with several topologies capable of classifying elements of the datasets. The best network correctly classifies 73% of the segments, outperforming Rotation Forest and Support Vector Machine classifiers. A submission to BirdCLEF using this deep neural network placed 6th among the 10 participating teams. In follow-up research it is hypothesised that shuffling the data before splitting introduces overfitting, which can be reduced by not shuffling the datasets prior to splitting and by using Dropout networks. The experiments are repeated with the same datasets, but without shuffling prior to splitting them into training and test sets. These second experiments show classification accuracies closer to the BirdCLEF results, with deep neural networks again outperforming the other classification methods: the best deep neural network once more beats Rotation Forest and Support Vector Machines. This shows that recent improvements in deep learning can also be applied to bioacoustic research, and to birdsong recognition in particular.
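The overfitting concern raised in the follow-up research can be illustrated with a small sketch. The thesis does not publish its splitting code, so the snippet below is a hypothetical toy model of the problem: when several segments come from the same recording and the pool is shuffled before splitting, segments of one recording end up on both sides of the split, so the test set is no longer independent of the training set. Splitting in recording order avoids this. All names and sizes here are illustrative, not taken from the thesis.

```python
import random

# Toy data: 10 hypothetical recordings, each yielding 5 birdsong segments.
# In the real pipeline each segment would carry MFCC/DFT features; here we
# only track which recording a segment came from.
segments = [(rec_id, seg_idx) for rec_id in range(10) for seg_idx in range(5)]

def split_with_shuffle(items, train_frac=0.8, seed=0):
    """Shuffle all segments first, then split: segments from the same
    recording can land in both train and test, inflating accuracy."""
    items = items[:]
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]

def split_without_shuffle(items, train_frac=0.8):
    """Split in recording order: a recording's segments stay on one side
    of the split (except possibly at the boundary)."""
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]

def leaked_recordings(train, test):
    """Recordings that contribute segments to both sides of the split."""
    return {r for r, _ in train} & {r for r, _ in test}

train_s, test_s = split_with_shuffle(segments)
train_o, test_o = split_without_shuffle(segments)
print(len(leaked_recordings(train_s, test_s)))  # typically nonzero: leakage
print(len(leaked_recordings(train_o, test_o)))  # zero here: no leakage
```

Under this toy setup the shuffled split leaks recordings across the boundary while the ordered split keeps them separate, which is one plausible mechanism for the gap between the first experiments' 73% accuracy and the lower BirdCLEF ranking.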