| dc.rights.license | CC-BY-NC-ND | |
| dc.contributor.advisor | Bosch, Antal van den | |
| dc.contributor.author | Buijse, Teun | |
| dc.date.accessioned | 2025-09-04T23:01:40Z | |
| dc.date.available | 2025-09-04T23:01:40Z | |
| dc.date.issued | 2025 | |
| dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/50345 | |
| dc.description.sponsorship | Utrecht University | |
| dc.language.iso | EN | |
| dc.subject | This thesis compares GPT-2 and TiMBL in next-word prediction. GPT-2 excels in accuracy but lacks lexical diversity. TiMBL models, especially TiMBL-closest, offer richer vocabulary but lower accuracy. A hybrid model combining both slightly improves on TiMBL's accuracy but, like GPT-2, favors frequent words. The study highlights a trade-off between predictability and lexical richness. | |
| dc.title | The Predictability-Diversity Trade-Off in Language Modeling: A Comparative Analysis of Memory-Based and Transformer-Based Next-Word Prediction | |
| dc.type.content | Master Thesis | |
| dc.rights.accessrights | Open Access | |
| dc.subject.courseuu | Applied Data Science | |
| dc.thesis.id | 53669 | |