Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Siebes, A.P.J.M.
dc.contributor.advisor: van Ommen, M.
dc.contributor.author: Havermans, S.A.C.
dc.date.accessioned: 2021-08-25T18:00:15Z
dc.date.available: 2021-08-25T18:00:15Z
dc.date.issued: 2021
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/41192
dc.description.abstract: This thesis compares several methods of classification based on cosine similarity computed during semantic search with Sentence-BERT (SBERT), as well as several class representations for few-shot classification with SBERT. The performance of SBERT is then compared with that of DistilBERT on a range of natural language processing (NLP) tasks (clickbait classification, sentiment analysis, spam detection and topic classification) and datasets, in order to determine for which tasks SBERT semantic search is an effective alternative to fine-tuning more traditional BERT models. The multilingual versions of both SBERT and DistilBERT are used for topic classification on a German dataset to assess the performance of multilingual SBERT. The best implementation of SBERT semantic search for few-shot classification combines similarity-based classification with average embeddings as class representations. The results indicate that both SBERT and DistilBERT show diminishing returns at around 25 samples per class in few-shot classification. Fine-tuning a DistilBERT model matches or outperforms SBERT semantic search on all assessed NLP tasks, at the cost of slightly more instability.
dc.description.sponsorship: Utrecht University
dc.format.extent: 1291144
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: An Implementation and Assessment of Semantic Search Few-Shot Classification
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: NLP; Transformers; BERT; Sentence-BERT; Text
dc.subject.courseuu: Applied Data Science
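
The abstract names similarity-based classification with average embeddings as class representations as the best-performing setup. The following is a minimal sketch of that general idea, not the thesis's implementation: the three-dimensional vectors are made-up stand-ins for SBERT sentence embeddings, and the helper names (`mean_vector`, `class_prototypes`, `classify`) are hypothetical.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def class_prototypes(support):
    """Average the few labelled embeddings of each class into one vector."""
    grouped = defaultdict(list)
    for label, vector in support:
        grouped[label].append(vector)
    return {label: mean_vector(vectors) for label, vectors in grouped.items()}


def classify(query, prototypes):
    """Return the label whose averaged embedding is most similar to the query."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


# Toy support set: (label, embedding) pairs. In practice the embeddings
# would come from a sentence encoder; these values are illustrative only.
support = [
    ("spam", [0.9, 0.1, 0.0]),
    ("spam", [0.8, 0.2, 0.1]),
    ("ham", [0.1, 0.9, 0.2]),
    ("ham", [0.0, 0.8, 0.3]),
]

prototypes = class_prototypes(support)
prediction = classify([0.7, 0.3, 0.1], prototypes)
print(prediction)
```

Averaging the support embeddings means each query is compared against one vector per class rather than against every labelled sample, so the number of similarity computations depends on the number of classes, not on the size of the support set.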

