Translating Legal Texts to B1 Dutch Language Level

Hanhart, Maurits

Publication date

2025

Summary

This research investigates whether large language models can make Dutch legal texts accessible to B1-level readers without sacrificing legal accuracy. Five models (GPT-4o, Claude Sonnet 4, Gemini 1.5 Pro, UL2-T5, and a fine-tuned Meta-LLaMA-3.1-8B-Instruct) are evaluated on a dataset of legal summaries from voorRecht-rechtspraak. The evaluation pipeline combines automatic metrics (BERTScore and the CEFR-based NT2Lex), an LLM-as-a-judge framework, and validation by both legal and linguistic experts. Results show that recent large language models, particularly Claude Sonnet 4 and GPT-4o, can reliably produce simplified legal texts that are much more accessible to non-experts while largely preserving the essential legal meaning and accuracy. The LLM-as-a-judge framework and the expert reviews both confirm strong performance across key criteria, indicating significant progress in automated legal simplification. Although occasional shortcomings persist, these findings suggest that, with further refinement, large language models have the potential to bridge the gap between complex legal language and public understanding.

URI

https://studenttheses.uu.nl/handle/20.500.12932/50462

Collections

Theses