dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Du, Y. | |
dc.contributor.author | Albers, J.M. | |
dc.date.accessioned | 2021-09-08T18:00:12Z | |
dc.date.available | 2021-09-08T18:00:12Z | |
dc.date.issued | 2021 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/1176 | |
dc.description.abstract | In this research I examined the differences between contextualized word embedding models for predicting the graded effect of context on word similarity. Each model was tested on a task in which the degree and direction of change in word similarity between two word pairs had to be predicted. The predictions were then compared to human-annotated scores. I found that the BERT architecture performs exceptionally well on English compared to other Transformer-based models. I also found that the multilingual BERT model obtains the best score on the low-resource languages Finnish, Hungarian and Slovenian. Furthermore, I found that stacked embeddings, in which multiple models are combined, offer room for improvement over already well-performing models. Finally, I recommend further research in which more models are compared and stacked embeddings are examined in more depth. | |
dc.description.sponsorship | Utrecht University | |
dc.format.extent | 566003 | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.title | Comparing Contextualised Embeddings for Predicting the (Graded) Effect of Context in Word Similarity | |
dc.type.content | Bachelor Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | NLP, contextualized word embedding, BERT, GPT-2, Flair, CoSimLex, word similarity, context | |
dc.subject.courseuu | Kunstmatige Intelligentie | |