dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Deemter, C.J. van
dc.contributor.author: Bruijn, Amber de
dc.date.accessioned: 2023-04-15T00:00:48Z
dc.date.available: 2023-04-15T00:00:48Z
dc.date.issued: 2023
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/43787
dc.description.abstract: One of the difficulties of artificially generating Mandarin Chinese text is the question of which classifier - a linguistic unit obligatory in numeral expressions - to choose in a given context. Several algorithms for classifier choice have recently been developed and assessed using a corpus-based evaluation. The best-scoring algorithm was a BERT classification model. However, a corpus-based evaluation yields a conservative score: it counts every classifier that does not match the corpus as incorrect, whereas native speakers might accept several different classifiers as correct in the same context. Since the ultimate goal of NLG should be the generation of texts that are useful to humans, we decided to perform a human evaluation in addition to the corpus-based one. We conducted two experiments: the first was a standard NLG evaluation, and the second was a more linguistically motivated experiment focusing only on true classifiers (a specific subset of Mandarin classifiers). We found that, according to human readers, BERT consistently performs better than the other models, in agreement with the corpus-based evaluation. However, we found no difference in the evaluation scores between BERT and the human-produced sentences in the corpus. This is remarkable, because the corpus-based evaluation suggests a large gap between BERT’s score and that of the corpus. This result suggests that human readers are more accepting of variation in classifier choice than previously thought.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Human evaluation of automatically generated classifiers in Mandarin Chinese
dc.title: Human evaluation of automatically generated classifiers in Mandarin Chinese
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: natural language generation; classifiers; Mandarin Chinese; human evaluation
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 15786