dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Schoot, Rens van de | |
dc.contributor.author | Caklovic, Ana | |
dc.date.accessioned | 2022-09-09T00:02:48Z | |
dc.date.available | 2022-09-09T00:02:48Z | |
dc.date.issued | 2022 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/42409 | |
dc.description.abstract | Feature extraction is the process of transforming the raw data into features that the model will be trained on while trying to preserve as much information as possible. Choosing the proper feature extractor can greatly affect the performance of a classifier. Feature extraction has evolved from the older techniques such as tf-idf and Doc2Vec to transformers that have already been pre-trained on large corpora. However, although the newer techniques seem promising, it is not always clear when and why one feature extractor may outperform another. The aim of this study is to examine if state-of-the-art feature extractors (i.e., transformers like RoBERTa, MPNET, and SPECTER) can outperform classical feature extractors (i.e., tf-idf and Doc2Vec) when classifying systematic reviews as relevant or irrelevant. The study involved running multiple simulations with the ASReview software to see how well the different feature extractors (in combination with various classifiers) classified research articles as relevant or irrelevant. The results indicated that a tf-idf feature extractor, in combination with a Naive Bayes classifier, outperformed all other combinations, including the sentence transformer feature extractors. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | Name of Topic: Saving Time and Sanity with AI - ASReview
Name of Thesis: Out with the Old and in with the New? - A Comparison of Classical vs. State-of-the-Art Feature Extractors in the Context of Systematic Reviews | |
dc.title | Out with the Old and in with the New? - A Comparison of Classical vs. State-of-the-Art Feature Extractors in the Context of Systematic Reviews | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | data science; AI; systematic review; feature extraction; transformers; SBERT; RoBERTa; ASReview | |
dc.subject.courseuu | Applied Data Science | |
dc.thesis.id | 8872 | |