
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Kaya, Heysem
dc.contributor.author: Zhang, Yizhe
dc.date.accessioned: 2023-09-30T00:00:53Z
dc.date.available: 2023-09-30T00:00:53Z
dc.date.issued: 2023
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/45272
dc.description.abstract: This thesis examines the performance of Fisher Vector representations in classifying personality traits from audio. The ChaLearn LAP First Impressions dataset, a multimodal dataset, is used. The work focuses on the audio modality and studies the classification performance of several audio feature extraction methods, including wav2vec 2.0, openSMILE, and the public dimensional emotion model (PDEM). Different encoding approaches, such as the Fisher Vector, are also studied for their effect on classifier performance. The results suggest that Fisher Vector representations are not the best choice for classifying personality traits from audio on this dataset; however, other feature extraction methods, such as openSMILE LLDs and PDEM, can achieve good performance on this task. The thesis also provides insights into parameter selection for feature engineering and into the interpretability of Fisher Vector representations.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: In this thesis, the author tested the performance of feature vectors generated using the Fisher Vector encoding for classification problems.
dc.title: Comparison of Acoustic Feature Representation Methods for Apparent Personality Recognition
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: paralinguistics; machine learning; classification; acoustics; feature engineering
dc.subject.courseuu: Game and Media Technology
dc.thesis.id: 24909

