
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Hoogeveen, Suzanne
dc.contributor.author: Knol, Sara
dc.date.accessioned: 2025-08-21T00:02:11Z
dc.date.available: 2025-08-21T00:02:11Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/49822
dc.description.abstract: This study examines (1) how analytical decisions contribute to variability in many-analyst studies and (2) whether specific decisions can be identified as key drivers. Several models, varying in complexity, were trained and validated on a synthetic multiverse dataset and tested for generalization on the many-analyst dataset from Breznau et al. (2022). While non-linear models performed well on the multiverse dataset (XGBoost R² = 0.96), none generalized to the many-analyst dataset (R² ≈ 0.0), possibly due to noise or the absence of key decisions in the synthetic data. SHAP values and feature importance indicated that choices about variables, especially the type of independent variables, were the most impactful. Although the current models failed to explain variance in many-analyst settings, the findings suggest that efforts to explain variability in many-analyst projects should employ complex models that capture non-linear relationships and should emphasize the choice of variables.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Hidden uncertainty in data analysis: Understanding sources of variability in many-analyst projects
dc.title: Hidden uncertainty in data analysis: Understanding sources of variability in many-analyst projects
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: many-analyst projects, meta research, analytical decisions, multiverse analysis, machine learning
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 52067

