
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Bosch, Antal van den
dc.contributor.author: Buijse, Teun
dc.date.accessioned: 2025-09-04T23:01:40Z
dc.date.available: 2025-09-04T23:01:40Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/50345
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This thesis compares GPT-2 and TiMBL in next-word prediction. GPT-2 excels in accuracy but lacks lexical diversity. TiMBL models, especially TiMBL-closest, offer richer vocabulary but lower accuracy. A hybrid model combines both, slightly boosting TiMBL's accuracy but favoring frequent words like GPT-2. The study highlights a trade-off between predictability and lexical richness.
dc.title: The Predictability-Diversity Trade-Off in Language Modeling: A Comparative Analysis of Memory-Based and Transformer-Based Next-Word Prediction
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 53669

