
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Schoot, Rens van de
dc.contributor.author: Caklovic, Ana
dc.date.accessioned: 2022-09-09T00:02:48Z
dc.date.available: 2022-09-09T00:02:48Z
dc.date.issued: 2022
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/42409
dc.description.abstract: Feature extraction is the process of transforming raw data into the features a model is trained on while preserving as much information as possible. Choosing the proper feature extractor can greatly affect the performance of a classifier. Feature extraction has evolved from older techniques such as tf-idf and Doc2Vec to transformers that have already been pre-trained on large corpora. However, although the newer techniques seem promising, it is not always clear when and why one feature extractor may outperform another. The aim of this study is to examine whether state-of-the-art feature extractors (i.e., transformers like RoBERTa, MPNET, and SPECTER) can outperform classical feature extractors (i.e., tf-idf and Doc2Vec) when classifying research articles for systematic reviews as relevant or irrelevant. The study involved running multiple simulations with the ASReview software to see how well the different feature extractors (in combination with various classifiers) classified research articles as relevant or irrelevant. The results indicated that a tf-idf feature extractor, in combination with a Naive Bayes classifier, outperformed all other combinations, including the sentence transformer feature extractors.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Name of Topic: Saving Time and Sanity with AI - ASReview; Name of Thesis: Out with the Old and in with the New? - A Comparison of Classical vs. State-of-the-Art Feature Extractors in the Context of Systematic Reviews
dc.title: Out with the Old and in with the New? - A Comparison of Classical vs. State-of-the-Art Feature Extractors in the Context of Systematic Reviews
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: data science; AI; systematic review; feature extraction; transformers; SBERT; RoBERTa; ASReview
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 8872

