
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Bosch, Antal van den
dc.contributor.author: Hammoud, Amir
dc.date.accessioned: 2024-10-23T23:02:08Z
dc.date.available: 2024-10-23T23:02:08Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/48009
dc.description.abstract: Recent advancements in large language models (LLMs) such as Meta’s Llama3 have substantially improved machine capabilities in language understanding and generation. However, the extent to which these models memorize versus generalize from their training data remains a critical issue, particularly given privacy and originality concerns. This study evaluates the memorization behavior of two differently sized Llama3 models (8 billion and 70 billion parameters) across various temperature settings. Testing on Wikipedia data, we analyzed model responses to measure exact text reproduction, employing a mixed-effects model to assess the impact of model size and temperature. Our findings indicate that lower temperatures significantly increase memorization, with the 70-billion-parameter model exhibiting a higher propensity for memorization than the 8-billion-parameter model. Additionally, the larger model showed greater sensitivity to temperature changes, displaying a sharp decline in memorization at higher temperatures. These results underscore the importance of optimizing temperature settings to balance memorization and creativity in LLMs, and provide concrete evidence of the existence of memorization in these models. Future research should aim to develop robust guidelines and safeguards to mitigate the dual risks of privacy breaches and model exploitation.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Investigating the Impact of Temperature on Memorization in Meta's LLaMA3 Models: A Comparative Analysis of 8B and 70B Parameters
dc.title: Investigating the Impact of Temperature on Memorization in Meta's LLaMA3 Models: A Comparative Analysis of 8B and 70B Parameters
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 40409

