dc.description.abstract | This research aimed to select, tune, and interpret a supervised Machine Learning (ML) model capable of correctly predicting the number of attributes, i.e., assessing dimensionality, in Cognitive Diagnosis Models (CDMs). These objectives were achieved by benchmarking various supervised ML algorithms, tuning the best-performing model, and applying interpretable ML in the form of counterfactual explanations. A large-scale simulated dataset of 607,579 observations and 946 predictors was used in this research. Feature selection reduced the number of predictors to 142, after which the analysis was performed. An ensemble model combining random forest and XGBoost outperformed the other supervised ML models, with a validation accuracy of 56.0%. Hyperparameter tuning using Model-Based Optimisation did not further increase the model's accuracy. The final model evaluation on unseen test data achieved an accuracy of 56.1%. Generated counterfactuals revealed relevant predictors influencing model predictions and showed that, on average, 26 predictors needed to be altered to correct misclassifications. Despite limitations in model performance, the chosen model still provided a meaningful improvement over the 11% baseline accuracy, and the counterfactuals offered insight into the complexity of dimensionality assessment in CDMs. | |