dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Wegmann, Anna
dc.contributor.author: Kole, Ruben
dc.date.accessioned: 2025-02-01T01:01:26Z
dc.date.available: 2025-02-01T01:01:26Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/48450
dc.description.abstract: Currently, there are no models trained specifically to create representations of Dutch linguistic style, nor has a task been developed to evaluate and verify such embeddings. In this thesis, I construct a model that creates a style representation for Dutch, and I create evaluation data to test whether that representation truly captures style. To create the embeddings, RobBERT-base is fine-tuned on the contrastive authorship verification task. To find the best-performing model, two datasets are constructed, and both the loss function and the margin value are varied. The performance of the fine-tuned models is in line with results reported in similar research on English style. For the evaluation, the STEL framework is adapted into a Dutch version. Some categories are carried over from the English variant and translated so that they properly reflect Dutch style; other categories are new in this version. There are two versions of the STEL task, one of which controls for content to ensure that the embedding bases its decision on style. The performance of the embeddings on the STEL task resembles the results reported for the English equivalent and shows that, for most tasks, the fine-tuned model performs better than the baseline model on the tasks that control for content. This thesis therefore concludes that methods devised for creating and evaluating English style representations can be adapted into Dutch versions that yield results similar to those of the originals.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Dutch style model trained using authorship verification
dc.title: Dutch style model trained using authorship verification
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 42562

