
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Schoot, Rens van de
dc.contributor.author: Kroft, Mathijs van der
dc.date.accessioned: 2022-09-09T04:01:45Z
dc.date.available: 2022-09-09T04:01:45Z
dc.date.issued: 2022
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/42714
dc.description.abstract: Active-learning-aided abstract screening can alleviate the labour-intensive process of systematic reviewing. In such a learning cycle, a machine learning model suggests the next abstract to be reviewed, and a researcher classifies the abstract as relevant or irrelevant. A systematic review should include all relevant studies, regardless of the language they are written in. Machine translation of abstracts helps here, but it is unknown how classification performance changes when abstracts are translated. This study simulates the active learning process with English datasets and with the same datasets machine-translated into German, Spanish and Turkish. A key step in the active learning pipeline is generating a vector representation of the text with a feature extractor. The feature extraction methods tf-idf, Doc2Vec, FastText and SBERT were compared on their classification performance for all languages. The results show no consistent disadvantage to translation for the selected datasets, except with FastText.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This study simulates the active learning process with English systematic review datasets and with the same datasets machine-translated into German, Spanish and Turkish. The feature extraction methods tf-idf, Doc2Vec, FastText and SBERT were compared on their classification performance for all languages.
dc.title: Language morphology in active learning aided systematic reviews
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: systematic review; active learning; simulation study; language morphology; machine translation
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 10472

