Show simple item record

dc.rights.licenseCC-BY-NC-ND
dc.contributor.advisorVelegrakis, Ioannis
dc.contributor.authorBuis, Mats
dc.date.accessioned2025-01-07T00:01:08Z
dc.date.available2025-01-07T00:01:08Z
dc.date.issued2025
dc.identifier.urihttps://studenttheses.uu.nl/handle/20.500.12932/48350
dc.description.abstractLarge Language Models (LLMs) consume significant amounts of energy during inference, especially for computationally expensive tasks like code generation, which raises environmental concerns. This work aims to reduce energy consumption during inference without compromising model performance. The energy consumption of Qwen2.5-Coder-7B-Instruct, Meta-LLaMA-3.1-8B-Instruct, and DeepSeekCoder-V2-Instruct-16B was evaluated on BigCodeBench, a benchmark consisting of 1,140 diverse coding tasks, using a software-based energy measurement approach. The relations between task nature, batch size, model size, fine-tuning, Activation-Aware Weight Quantization (AWQ), and GPTQ with 8-bit and 4-bit precision were investigated for a variety of models, including the Qwen2.5 models. Results indicate that task nature significantly affects energy consumption across all tested models, while batch size has only a minor effect. Notably, the Meta-LLaMA model consumed 130.77% more energy than the DeepSeekCoder model while achieving lower accuracy. Fine-tuning, AWQ, GPTQ-INT8, and GPTQ-INT4 quantization reduced energy consumption by up to 19%, 67%, 40%, and 67%, respectively. GPTQ-INT8 models achieved these reductions without significantly reduced accuracy, whereas GPTQ-INT4 models showed slight decreases and AWQ showed substantially lower pass@1 scores. This work demonstrates that the energy consumption of LLMs can be reduced effectively without significant performance loss, underscoring the importance of innovative research for sustainable AI practices.
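The abstract's software-based energy measurement approach can be sketched as integrating sampled GPU power draw over the inference window. The sampling source (e.g. a driver API such as NVML) and the power values below are hypothetical, purely for illustration:

```python
def energy_joules(samples):
    """Integrate (timestamp_s, power_w) samples with the trapezoidal rule.

    samples: list of (time in seconds, instantaneous power in watts),
    sorted by time, as would be collected by polling a GPU power sensor
    while the model runs inference.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += (p0 + p1) / 2.0 * (t1 - t0)  # area of one trapezoid
    return total

# Hypothetical power trace sampled every 0.5 s during one inference batch.
trace = [(0.0, 250.0), (0.5, 310.0), (1.0, 305.0), (1.5, 260.0)]
print(energy_joules(trace))  # total joules over the 1.5 s window
```

Comparing such totals across models, batch sizes, and quantization settings (while holding the task set fixed) is the kind of controlled comparison the abstract reports.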
dc.description.sponsorshipUtrecht University
dc.language.isoEN
dc.subjectThis thesis examines the trade-off between energy efficiency and accuracy in code generation tasks performed by LLMs, and shows which optimisation methods can reduce the energy consumption of LLMs while maintaining accuracy, so that they remain usable in software production.
dc.titleWalking the Tightrope: Balancing Energy Efficiency and Accuracy in LLM-Driven Code Generation
dc.type.contentMaster Thesis
dc.rights.accessrightsOpen Access
dc.subject.keywordsLLMs; Energy Efficiency; AI; Code Generation; Software Production; Quantization; Fine-Tuning
dc.subject.courseuuArtificial Intelligence
dc.thesis.id41969

