Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Dalpiaz, Fabiano
dc.contributor.author: Cheng, Junxin
dc.date.accessioned: 2025-10-08T23:01:27Z
dc.date.available: 2025-10-08T23:01:27Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/50513
dc.description.abstract: Automated trace link recovery between issues and commits is essential for maintaining requirements traceability, as it reduces the manual effort required in large-scale software projects. This study investigates the effectiveness of large language models (LLMs) in generating semantic representations within a machine learning classification framework for automated trace link recovery. To this end, we formulate three research questions: (1) How effective is feature representation via LLM embeddings compared to information retrieval (IR) methods, static word embeddings, and Bidirectional Encoder Representations from Transformers (BERT)-based models? (2) What is the relative contribution of textual and non-textual features to supervised issue-commit link classification? (3) Which classification algorithm performs best with the engineered features? We construct three categories of feature sets (textual, non-textual, and a combination of both) from data on eight open-source projects, and we apply five models (VSM with TF-IDF, FastText, Word2Vec, Sentence Transformer, and OpenAI's embedding model) to evaluate the effectiveness of the semantic representations. These models are assessed using two classifiers (Random Forest and XGBoost) in two practical scenarios: trace recommendation and trace maintenance. Evaluation metrics include Precision, Recall, F2, and F0.5 scores, supported by statistical significance tests and feature importance analysis. The results show that textual features generated by the VSM with TF-IDF consistently outperform the other semantic and non-textual features, demonstrating both the effectiveness of the domain-specific term distributions captured by traditional IR methods and the importance of high-quality semantic representations.
Nonetheless, LLM-based models achieve comparable performance without domain-specific fine-tuning, suggesting their strong potential for automated trace link recovery. Additionally, Random Forest outperforms XGBoost in both evaluation scenarios. This comparative study provides practical insights into designing robust LLM-enhanced traceability support systems for requirements engineering in modern software development environments. We introduce a hybrid approach that integrates traditional IR models, static and contextual embeddings (including LLM-based representations), and both textual and non-textual features within a supervised classification framework. Future work may focus on fine-tuning LLMs for domain-specific contexts, enriching the feature space with additional development artifacts, and exploring prompt-based or interactive trace inference. Investigating lightweight deployment strategies and alternative classifiers also presents promising directions for practical, scalable use.
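The textual-feature pipeline the abstract describes (TF-IDF vector space representations of issue-commit pairs fed to a Random Forest classifier, evaluated with F2 and F0.5) can be sketched as follows. This is a minimal illustration with invented toy data; the thesis's actual datasets, feature engineering, and hyperparameters are not reproduced here.

```python
# Sketch, under stated assumptions: TF-IDF (VSM) features for
# issue/commit text pairs, a Random Forest classifier, and the
# recall-weighted F2 / precision-weighted F0.5 scores the study uses.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import fbeta_score

# Hypothetical concatenated issue + commit texts; label 1 = true trace link.
docs = [
    "fix null pointer in login handler | patch auth NPE",
    "add dark mode toggle | update css theme variables",
    "fix null pointer in login handler | refactor build script",
    "add dark mode toggle | bump dependency versions",
]
labels = [1, 1, 0, 0]

# Textual features: TF-IDF weights over the combined vocabulary.
vec = TfidfVectorizer()
X = vec.fit_transform(docs)

# Supervised link classification with Random Forest.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, labels)
pred = clf.predict(X)

# F-beta: beta=2 emphasizes recall, beta=0.5 emphasizes precision.
f2 = fbeta_score(labels, pred, beta=2.0)
f05 = fbeta_score(labels, pred, beta=0.5)
print(f"F2={f2:.2f} F0.5={f05:.2f}")
```

In practice the pairs would come from the eight projects' issue trackers and commit logs, with a held-out split per evaluation scenario rather than scoring on the training data as this toy example does.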
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This thesis investigates LLM-based semantic representations for automated trace link recovery between issues and commits. It compares LLM embeddings with IR methods, static embeddings, and BERT-based models, analyzing textual and non-textual features using Random Forest and XGBoost on eight open-source projects.
dc.title: Exploring LLM-Based Semantic Representations in a Hybrid Approach for Automated Trace Link Recovery
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Requirement Traceability, Trace Link Recovery, Issue-commit Link Recovery, Large Language Models, Machine Learning
dc.subject.courseuu: Business Informatics
dc.thesis.id: 53073

