Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Schoot, Rens van de
dc.contributor.author: Angeren, Marco van
dc.date.accessioned: 2025-08-21T00:02:59Z
dc.date.available: 2025-08-21T00:02:59Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/49838
dc.description.abstract: This paper introduces MENCOD (Multi-modal ENsemble Citation Outlier Detector), a novel approach for identifying outliers in academic literature screening. In this context, an outlier refers to a relevant paper that was not retrieved before the stopping rule of an active learning pipeline was triggered, typically because it was ranked much lower than other relevant papers. MENCOD addresses this by proposing a two-phase process: after stopping, a new model is trained using additional information not exploited in the first phase, such as citation networks and metadata. The method combines multiple Local Outlier Factor (LOF)-based models and an isolation forest, leveraging both structural and semantic features. Semantic similarity is computed using SPECTER2 embeddings and cosine similarity. Evaluated on three datasets from the Synergy project (Hall, Jeyaraman, and Appenzeller), MENCOD consistently reprioritized missed relevant papers more effectively than the baseline active learning approach. The improvements were 86.5%, 29.8%, and 75.7%, respectively, amounting to thousands of documents that no longer require manual screening. Although still a conceptual prototype, MENCOD shows strong potential for enhancing the recall of relevant literature in large-scale screening tasks.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Detecting outliers in scientific papers ranked by relevance by applying machine learning algorithms
dc.title: Introducing MENCOD: Multi-modal ENsemble Citation Outlier Detector
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 52091
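
The abstract mentions that semantic similarity is computed with cosine similarity over SPECTER2 embeddings. As a rough illustration only, the sketch below re-ranks unscreened papers by their mean cosine similarity to papers already labelled relevant. The function names, the toy 3-dimensional vectors, and the mean-similarity scoring rule are all assumptions made for this example; they are not taken from the thesis and do not reproduce MENCOD's LOF or isolation forest components.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors, which have no direction
    return dot / (norm_a * norm_b)


def rerank_by_similarity(unscreened, relevant):
    """Order unscreened paper ids by mean similarity to relevant papers.

    Hypothetical scoring rule: the thesis may weight or combine
    similarities differently.
    """
    scores = {
        paper_id: sum(cosine_similarity(vec, r) for r in relevant) / len(relevant)
        for paper_id, vec in unscreened.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy vectors standing in for real embeddings, which are much longer.
relevant = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
unscreened = {
    "paper_a": [0.0, 1.0, 0.0],
    "paper_b": [0.8, 0.2, 0.0],
    "paper_c": [0.0, 0.0, 1.0],
}
ranking = rerank_by_similarity(unscreened, relevant)
```

In this toy setup, `paper_b` points in nearly the same direction as both relevant vectors, so it moves to the front of the ranking, which is the kind of reprioritization the abstract describes for missed relevant papers.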


