dc.rights.license CC-BY-NC-ND
dc.contributor.advisor Spruit, M.R.
dc.contributor.advisor Akkerman, S.F.
dc.contributor.author Meer, T. van der
dc.date.accessioned 2020-08-31T18:00:27Z
dc.date.available 2020-08-31T18:00:27Z
dc.date.issued 2020
dc.identifier.uri https://studenttheses.uu.nl/handle/20.500.12932/37201
dc.description.abstract The analysis of the interests of young adolescents, expressed as short, colloquial Dutch text, is a challenging task for pre-trained neural networks. Four language models pre-trained on Dutch are compared and contrasted through quantitative and qualitative tests. Three additional models, obtained by language-model fine-tuning, are included in the qualitative tests to assess transfer learning capabilities. For the quantitative comparison, a classifier is trained on a named entity recognition task and a sentiment analysis task. For the qualitative comparison, the outputs of the embedding layer are used to gain insight into relation classification and clustering. A test for ranking interest-pair similarities was developed to investigate the models' semantic understanding of Dutch. Furthermore, the models' ability to cluster related interests is examined. Finally, given relation structures in sports, instruments and school courses are put to the test. BERTje outperforms the other models on the quantitative tasks. However, BERTje performs worst on the triplet ranking test. The fine-tuned RobBERT and FastText show the best results in the triplet analysis. None of the models shows semantic understanding in the clustering analysis. FastText shows the most semantic understanding of the relation structures, though its performance is still relatively poor. The outputs of the embedding layer show that the models do not have a semantic understanding of Dutch but fall back on morphological structures. Therefore, these techniques are not ready to be used for interest analysis. Creating a downstream task, data enrichment and knowledge infusion are candidates for improving interest analysis.
dc.description.sponsorship Utrecht University
dc.format.extent 2047408
dc.format.mimetype application/pdf
dc.language.iso en
dc.title A Deep Learning Approach to Interest Analysis
dc.type.content Master Thesis
dc.rights.accessrights Open Access
dc.subject.keywords interest analysis, natural language processing,
dc.subject.courseuu Business Informatics

