Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Gatt, A.
dc.contributor.author: Carboni, Leonardo
dc.date.accessioned: 2024-07-24T23:07:19Z
dc.date.available: 2024-07-24T23:07:19Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/46906
dc.description.abstract: In the constantly evolving field of software development, the demand for automated code generation has increased significantly since the release of AI-based tools such as ChatGPT and GitHub Copilot. These tools, powered by Large Language Models (LLMs), typically require server requests due to their closed-source nature and substantial computational costs. This thesis investigates the potential of smaller, locally runnable low-parameter LLMs for code generation. The research begins with an overview of the state of the art in coding and its anticipated evolution through AI integration. It then analyzes the current landscape of LLMs, explaining the underlying mechanisms of these models and listing several of the most important low-parameter models, such as Mistral, CodeLlama and DeepSeek-Coder. The study also examines the impact of techniques such as fine-tuning, instruction-tuning and quantization on improving performance and efficiency. Additionally, it reviews the available code evaluation techniques, focusing on match-based and functional metrics, and discusses the datasets used to evaluate the models. The methodology involves selecting suitable datasets and models, generating code samples, and evaluating them with both types of metrics. The evaluation highlights the limitations of match-based metrics in capturing an LLM's true code-generation performance and emphasizes the importance of functional metrics such as pass rates. The findings indicate that while larger models generally outperform smaller ones, the performance gap is narrowing thanks to higher-quality and more domain-specific training data. Moreover, the study confirms the effectiveness of the aforementioned fine-tuning and quantization techniques in improving the models' capabilities and lowering the requirements needed to run them. The thesis concludes by suggesting that, with continuous advancements, smaller models could play a crucial role in making high-quality code generation more accessible and sustainable.
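For context on the functional metrics mentioned in the abstract: pass rates for code generation are commonly reported as pass@k. The sketch below is a minimal, illustrative implementation of the standard unbiased pass@k estimator (Chen et al., 2021); it is not taken from the thesis, and the function and example numbers are placeholders.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total code samples generated per problem
    c: number of those samples that pass all unit tests
    k: number of attempts allowed per problem
    """
    if n - c < k:
        # Every possible subset of k samples contains at least one passing sample.
        return 1.0
    # Probability that a random subset of k samples contains no passing sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 200 samples per problem, 37 of which pass the tests.
print(round(pass_at_k(n=200, c=37, k=1), 3))   # estimated pass rate with 1 attempt
print(round(pass_at_k(n=200, c=37, k=10), 3))  # estimated pass rate with 10 attempts
```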
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Analysis of Large Language Models in the context of Natural Language Generation for code
dc.title: Code Generation on a Diet: A Comparative Evaluation of Low-Parameter Large Language Models
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 34827

