
        Code Generation on a Diet: A Comparative Evaluation of Low-Parameter Large Language Models

        File
        Master_Thesis.pdf (1.905 MB)
        Publication date
        2024
        Author
        Carboni, Leonardo
        Summary
        In the constantly evolving field of software development, the demand for automated code generation has increased significantly since the release of AI-based tools like ChatGPT and GitHub Copilot. These tools, powered by Large Language Models (LLMs), typically require server requests due to their closed-source nature and substantial computational costs. This thesis investigates the potential of smaller, locally runnable, low-parameter LLMs in the context of code generation. The research begins with an overview of the state of the art in coding and its anticipated evolution through AI integration. It then analyzes the current landscape of LLMs, explaining the underlying mechanisms of these models and surveying several of the most important low-parameter models, such as Mistral, CodeLlama, and DeepSeek-Coder. The study also examines the impact of techniques like fine-tuning, instruction-tuning, and quantization on improving performance and efficiency. Additionally, it reviews the available code evaluation techniques, focusing on match-based and functional metrics, and discusses the datasets used to evaluate the models. The methodology involves selecting suitable datasets and models, generating code samples, and evaluating them with both types of metrics. The evaluation highlights the limitations of match-based metrics in capturing an LLM's true code generation performance and emphasizes the importance of functional metrics such as pass rates. The findings indicate that while larger models generally outperform smaller ones, the performance gap is narrowing thanks to higher-quality and more domain-specific training data. Moreover, the study confirms the effectiveness of the aforementioned fine-tuning and quantization techniques in improving the models' capabilities and lowering the requirements for running them. The thesis concludes by suggesting that, with continued advances, smaller models could play a crucial role in making high-quality code generation more accessible and sustainable.
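
        For context, the pass rates mentioned in the summary are commonly computed with the unbiased pass@k estimator of Chen et al. (2021); whether the thesis uses this exact estimator is an assumption. A minimal sketch in Python:

        import numpy as np

        def pass_at_k(n: int, c: int, k: int) -> float:
            """Unbiased pass@k: the probability that at least one of k
            samples, drawn from n generated solutions of which c pass
            the tests, is correct, i.e. 1 - C(n-c, k) / C(n, k)."""
            if n - c < k:
                # Every size-k draw must contain a correct sample.
                return 1.0
            # Numerically stable product form of 1 - C(n-c, k) / C(n, k)
            return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

        # Example: 200 samples for a problem, 37 passing; estimate pass@10
        print(pass_at_k(n=200, c=37, k=10))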
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/46906
        Collections
        • Theses