Comparing Contextualised Embeddings for Predicting the (Graded) Effect of Context in Word Similarity

Albers, J.M.

View/Open

6400507_JorisAlbers_Thesis.pdf (552.7Kb)

Publication date

2021

Author

Albers, J.M.

Summary

In this research I compared contextualized word embedding models on predicting the graded effect of context in word similarity. Each model was tested on a task in which the degree and direction of change in word similarity of two word pairs had to be predicted. The predictions were then compared to human-annotated scores. I found that the BERT architecture works exceptionally well on English compared to other Transformer-based models. I also found that the multilingual BERT models obtain the best scores on the low-resource languages Finnish, Hungarian and Slovenian. Furthermore, I found that stacked embeddings, in which multiple models are combined, offer room for improvement for models that already perform well. Finally, I recommend further research that compares more models and examines stacked embeddings in more depth.
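
As a rough illustration of the task described in the summary, the sketch below scores how the similarity of a word pair changes between two contexts, and shows one way embeddings from several models could be stacked. It rests on assumptions that this record does not confirm: cosine similarity as the similarity measure, concatenation as the stacking method, and hand-made toy vectors in place of real model output. The function names are hypothetical and are not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_change(pair_in_context_1, pair_in_context_2):
    """Signed change in similarity of a word pair between two contexts.

    Each argument is a tuple (vector_of_word_1, vector_of_word_2) holding
    the contextualized embeddings of the same two words in one context.
    The sign gives the direction of the change, the magnitude its degree.
    Assumption: cosine similarity is the similarity measure.
    """
    w1_c1, w2_c1 = pair_in_context_1
    w1_c2, w2_c2 = pair_in_context_2
    return cosine(w1_c2, w2_c2) - cosine(w1_c1, w2_c1)


def stack(*vectors):
    """Stack embeddings from several models by concatenation (assumption)."""
    stacked = []
    for vector in vectors:
        stacked.extend(vector)
    return stacked


# Toy vectors standing in for real model output (not real data).
context_1 = ([1.0, 0.0], [0.0, 1.0])  # the two words look unrelated here
context_2 = ([1.0, 0.0], [1.0, 0.0])  # the two words look identical here

change = similarity_change(context_1, context_2)
print(f"change in similarity: {change:+.2f}")
```

In an evaluation of this kind, the predicted changes for all word pairs would then be compared against the human-annotated changes; the specific evaluation metric is not stated in this record.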

URI

https://studenttheses.uu.nl/handle/20.500.12932/1176

Collections

Theses