
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Siebes, Arno
dc.contributor.author: Meelker, C.M.
dc.date.accessioned: 2021-08-25T18:00:13Z
dc.date.available: 2021-08-25T18:00:13Z
dc.date.issued: 2021
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/41186
dc.description.abstract: Summary-writing tasks are often used to assess reading comprehension of students. Grading these types of tasks is time-consuming and teachers have difficulty being consistent when grading. The goal of this research is therefore to explore and evaluate the possibilities of automating summary grading. Previous research has shown that students with an extensive mental model, and thus a good understanding of the original text, write high-quality summaries. Linguistic features can therefore be used to measure summary quality. A total of 82 different linguistic features is calculated for a dataset of 914 short Dutch summaries. These summaries have been graded by teachers. Through cross-validated feature selection, an optimal set of features is selected for both a regression and classification model. The regression model can be used to predict a grade and has an explained variance of 0.71. The classification model can be used to predict a 'Fail' or 'Pass' label and has an area-under-the-ROC curve of 0.91. It can therefore be concluded that linguistic feature-based models can successfully be used to automate summary grading. The models developed in this research could potentially replace a second or third reader.
dc.description.sponsorship: Utrecht University
dc.format.extent: 1181475
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Automated summary scoring using a linguistic feature approach
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: AES, Machine Learning, Data Science
dc.subject.courseuu: Applied Data Science

