dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Gerard Vreeswijk, Qixiang Fang
dc.contributor.author: Duncker, P.B.
dc.date.accessioned: 2020-09-14T18:00:13Z
dc.date.available: 2020-09-14T18:00:13Z
dc.date.issued: 2020
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/37640
dc.description.abstract: MOOCs have become commonplace in distance learning over the past decade, but they face high student dropout rates. A warning system that identifies at-risk students could reduce the withdrawal rate. Many prior studies investigated student dropout prediction models in MOOCs, but most used a single course or a non-dynamic dataset. Such models tend to overfit on dynamic real-world data because they were trained on a small, specific dataset. This paper contributes to the body of research by investigating the student dropout prediction performance of the XGBoost algorithm and evaluating the most important predictors at each stage on a diverse and dynamic dataset. The analysis uses the OULA dataset, which is pre-processed into three main predictor groups: demographics, assessment data and log data (VLE interaction data). The data is split into eleven intervals, and the data gets richer with each interval. Two analyses are performed. The first uses the dataset from which withdrawal data is not removed; this resulted in a prediction accuracy between 0.76 and 0.86, with log and assessment data as the most important predictors. The second uses the dynamic dataset, in which the data of dropped-out students is removed after they withdraw from a course; this resulted in a performance that could not beat the accuracy threshold. This implies that XGBoost is not able to predict dropouts on a dynamic dataset.
dc.description.sponsorship: Utrecht University
dc.format.extent: 109525
dc.format.mimetype: application/vnd.openxmlformats-officedocument.wordprocessingml.document
dc.language.iso: en
dc.title: Identifying at-risk students across different stages of distance learning courses and identifying their most relevant predictors at each stage.
dc.type.content: Bachelor Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Distance Learning, Machine Learning, MOOCs, OULAD, Student dropout, XGBoost
dc.subject.courseuu: Kunstmatige Intelligentie

