Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor: Lientje Maas & Zoë ten Napel
dc.contributor.advisor: Brinkhuis, Matthieu
dc.contributor.author: Lopes Motoki, Isabela
dc.date.accessioned: 2025-08-21T00:02:20Z
dc.date.available: 2025-08-21T00:02:20Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/49825
dc.description.abstract: This study aims at the automatic evaluation of curriculum alignment, which refers to the extent to which learning objectives, instructional activities, and assessments are coherently aligned. Traditionally, measuring this alignment is a time-consuming and often subjective process, since it typically involves evaluating all educational materials against the learning objectives of the curriculum. To address this, the research explores the use of large language models (LLMs) to automate the annotation of Dutch assessment questions with subject-specific concepts. Specifically, it investigates both generative (GPT-4.1 nano) and non-generative (mBERT) models using a labeled dataset of Dutch statistics questions. Results indicate that LLMs show strong potential in this domain: GPT achieved up to 71.1% accuracy and a 62.2% macro F1 score, while mBERT reached 91.7% accuracy and an 83.7% macro F1 score. Additionally, prompt engineering substantially improved GPT's performance. The findings also highlight the importance of careful adaptation and evaluation across diverse educational contexts and task types, as performance varied with question category and subject matter. This research contributes to the integration of AI in education by providing an effective solution for question annotation and offering insights into which approaches are better suited to different educational scenarios. As a result, educators can better align assessments with learning objectives and enhance the overall learning experience.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This study aims at the automatic evaluation of curriculum alignment. Specifically, it investigates both generative (GPT-4.1 nano) and non-generative (mBERT) models using a labeled dataset of Dutch statistics questions.
dc.title: Automatic Annotation of Dutch Educational Assessment Questions using Large Language Models
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Curriculum Alignment; LLM; mBERT; GPT-4.1; Annotation of Educational Questions; Dutch Questions
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 52069
