Comparison of Acoustic Feature Representation Methods for Apparent Personality Recognition

Zhang, Yizhe

View/Open

YizheZhangFinal.pdf (733.5Kb)

Publication date

2023

Summary

This thesis examines how well Fisher Vector representations perform when classifying apparent personality traits from audio. The experiments use the audio modality of the ChaLearn LAP First Impressions dataset, a multimodal dataset. Several acoustic feature extraction methods are compared on the classification task, including wav2vec 2.0, openSMILE, and the public dimensional emotion model (PDEM). Different encoding approaches, such as Fisher Vector encoding, are also studied to see how they affect classifier performance. The results suggest that Fisher Vector representations are not the best choice for classifying personality traits from audio on this particular dataset, whereas other feature representations, such as openSMILE LLDs (low-level descriptors) and PDEM, can achieve good performance. The thesis also offers insights into parameter selection for feature engineering and into the interpretability of Fisher Vector representations.
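
For readers unfamiliar with the encoding named in the summary, the following is a minimal sketch of a Fisher Vector computed from frame-level acoustic features under a diagonal-covariance Gaussian mixture model. It is not taken from the thesis: the function name `fisher_vector`, the toy data, and the GMM parameters (`weights`, `means`, `variances`) are hypothetical placeholders, and in practice the mixture would first be fitted on training frames. The sketch assumes the commonly used "improved" variant with power and L2 normalisation, which may differ from the configuration studied in the thesis.

```python
import numpy as np


def fisher_vector(frames, weights, means, variances):
    """Encode a (T, D) matrix of frame features as a Fisher Vector of length 2*K*D.

    weights: (K,), means: (K, D), variances: (K, D) describe a diagonal GMM.
    All parameter values here are illustrative assumptions.
    """
    T, _ = frames.shape
    # Standardised difference of every frame to every component: (T, K, D)
    diff = (frames[:, None, :] - means[None, :, :]) / np.sqrt(variances)[None, :, :]
    # Log-density of each frame under each diagonal Gaussian: (T, K)
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_prob = -0.5 * (np.sum(diff ** 2, axis=2) + log_norm[None, :])
    # Posterior responsibilities, computed stably in the log domain
    log_post = np.log(weights)[None, :] + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    # Gradients with respect to the means and the variances: (K, D) each
    g_mu = np.einsum("tk,tkd->kd", post, diff) / (T * np.sqrt(weights)[:, None])
    g_var = np.einsum("tk,tkd->kd", post, diff ** 2 - 1.0) / (
        T * np.sqrt(2.0 * weights)[:, None]
    )
    fv = np.concatenate([g_mu.ravel(), g_var.ravel()])
    # Power normalisation followed by L2 normalisation
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv


# Toy usage: 200 frames of 4-dimensional features, 3 mixture components
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 4))
weights = np.full(3, 1.0 / 3.0)
means = rng.normal(size=(3, 4))
variances = np.ones((3, 4))
fv = fisher_vector(frames, weights, means, variances)
```

The resulting vector has a fixed length of 2 × K × D regardless of how many frames a clip contains, which is what allows variable-length audio to be passed to a standard classifier.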

URI

https://studenttheses.uu.nl/handle/20.500.12932/45272

Collections

Theses