
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Kunneman, Florian
dc.contributor.author: Mathijssen, Ole
dc.date.accessioned: 2024-07-24T23:03:50Z
dc.date.available: 2024-07-24T23:03:50Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/46864
dc.description.abstract: Text simplification aims to make text more readable by reducing its linguistic complexity. This study explores the use of sequence-to-sequence transformer models to simplify Dutch governmental letters, enhancing their readability for individuals with low literacy levels. Various models, including T5-Small, BART-Base, mT5-Small, mBART-Large-50, T5-Base-Dutch, UL2-Small-Dutch and UL2-Small-Dutch-Simplification, were trained on datasets comprising complex and simplified Dutch sentences. These models were evaluated using quantitative metrics such as the Flesch-Kincaid Grade Level and the BLEU and SARI scores, complemented by a qualitative analysis. The best-performing model was applied to a dataset of letters provided by the Rijksdienst voor Ondernemend Nederland to produce simplified versions. The study demonstrates that while the models slightly improve readability as indicated by Flesch-Kincaid scores, qualitative analysis reveals significant issues with content preservation and coherence. This highlights the need for further refinement to achieve the desired readability improvement while maintaining accuracy in Dutch text simplification.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Enhancing Readability of Governmental Letters Using Large Language Models
dc.title: Enhancing Readability of Governmental Letters Using Large Language Models
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 34881
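The abstract above cites the Flesch-Kincaid Grade Level among its evaluation metrics. As a hedged sketch only (not the thesis's actual code, which is not reproduced in this record), the metric can be computed from three raw counts; the formula below is the standard English-language FKGL, which the study applies to Dutch text:

```python
def fkgl(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw text counts.

    Standard coefficients for English text: longer sentences and
    longer words (more syllables per word) raise the grade level.
    """
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Illustrative counts (hypothetical, not from the thesis dataset):
# 100 words across 5 sentences with 140 syllables.
grade = fkgl(words=100, sentences=5, syllables=140)
```

A lower grade level after simplification indicates a gain in readability, which is how the abstract reports the models' quantitative improvement.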

