        •   Utrecht University Student Theses Repository Home
        Identifying at-risk students across different stages of distance learning courses and identifying their most relevant predictors at each stage.

        Thesis.PaulDuncker.docx (106.9Kb)
        Publication date
        2020
        Author
        Duncker, P.B.
        Summary
        MOOCs have become commonplace in distance learning over the past decade, but they face high student dropout rates. A warning system that identifies at-risk students could decrease the withdrawal rate. Many prior studies have investigated student dropout prediction models in MOOCs, but most used a single course or a non-dynamic dataset. These models tend to overfit with respect to dynamic real-world data because they were trained on a small, specific dataset. This paper contributes to the body of research by investigating the dropout prediction performance of the XGBoost algorithm and by evaluating the most important predictors at each stage on a diverse and dynamic dataset. The analysis uses the OULA dataset, which is pre-processed into three main predictor groups: demographics, assessment data, and log data (VLE interaction data). The data is split into eleven intervals, and it gets richer with each interval. Two analyses are performed. The first uses the dataset from which the withdrawal data is not removed; it resulted in a prediction accuracy between 0.76 and 0.86, with log and assessment data as the most important predictors. The second uses the dynamic dataset, in which the data of dropped-out students is removed after they withdraw from a course; its performance could not beat the accuracy threshold. This implies that XGBoost is not able to predict dropouts on a dynamic dataset.
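
        The difference between the two analyses described in the summary can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the `Student` record, the single `clicks_so_far` feature, the `snapshot` helper, and the toy data are all hypothetical, and the sketch assumes that "richer" means features accumulate over intervals and that "dynamic" means students who withdrew in an earlier interval are excluded from later snapshots.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

N_INTERVALS = 11  # the summary states that the data is split into eleven intervals


@dataclass
class Student:
    # Hypothetical record; the real dataset has many more fields.
    student_id: str
    clicks_per_interval: List[int]     # assumed VLE log counts, one per interval
    withdrew_in: Optional[int] = None  # 1-based interval of withdrawal, None if never


def snapshot(students: List[Student], interval: int, drop_withdrawn: bool) -> List[Dict]:
    """Build the rows available at a given interval (1-based).

    Features are cumulative, so later intervals carry richer data.
    With drop_withdrawn=True, students who withdrew in an earlier
    interval are excluded (the assumed "dynamic" setting).
    """
    rows = []
    for s in students:
        already_gone = s.withdrew_in is not None and s.withdrew_in < interval
        if drop_withdrawn and already_gone:
            continue
        rows.append({
            "student_id": s.student_id,
            "clicks_so_far": sum(s.clicks_per_interval[:interval]),
            "label_withdraws": s.withdrew_in is not None,
        })
    return rows


# Toy data, invented for illustration only.
students = [
    Student("a", [5] * N_INTERVALS),
    Student("b", [3, 1] + [0] * (N_INTERVALS - 2), withdrew_in=2),
    Student("c", [4] * N_INTERVALS, withdrew_in=6),
]

static_rows = snapshot(students, interval=4, drop_withdrawn=False)
dynamic_rows = snapshot(students, interval=4, drop_withdrawn=True)
```

        In this toy setup the static snapshot at interval 4 still contains student "b", whose activity has already dropped to zero, while the dynamic snapshot does not. Under these assumptions, a classifier trained on the static snapshots can lean on post-withdrawal inactivity, which is one plausible reading of why the two analyses in the summary differ so strongly.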
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/37640
        Collections
        • Theses