
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Telea, Alex
dc.contributor.author: Grosu, Cristian
dc.date.accessioned: 2024-01-09T00:01:18Z
dc.date.available: 2024-01-09T00:01:18Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/45791
dc.description.abstract: Training classifier models on semi-labeled datasets, which often contain only a limited number of labeled samples, is challenging. This thesis proposes a user-centric methodology for pseudo-labeling semi-labeled data that combines automatic pseudo-labeling algorithms with user-driven correction of mislabeled data points. The methodology is supported by several visual analytics approaches: sample visualization via dimensionality reduction techniques, and visualization of classifier decision boundaries using so-called Decision Boundary Maps (DBMs). These visuals allow users to find regions of uncertainty where automatic pseudo-labeling may have made errors and to correct those errors. To speed up the visual analytics loop, we propose various heuristics for efficient and accurate DBM computation. User experiments show that both domain-expert and non-expert users were able to consistently correct wrong labels and improve classifier performance for different datasets and classifier models, with limited effort and in a limited amount of time. The study underscores the importance and potential of visualization tools in the context of semi-labeled datasets and semi-supervised learning, and provides a foundation for future research in this area.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Decision Boundary Mapping techniques for improving semi-supervised machine learning models. The Ethics and Privacy Quick Scan of the Utrecht University Research Institute of Information and Computing Sciences classified this research as low-risk with no fuller ethics review or privacy assessment required.
dc.title: Decision Boundary Maps for Supporting User-Driven Pseudo-labeling
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Semi-supervised learning; Pseudo-labeling; Dimensionality Reduction; Decision Boundary Maps; Data visualization
dc.subject.courseuu: Computing Science
dc.thesis.id: 26897

