Software Quality Assessment of Developer Community Reviews Using Sentiment Analysis Techniques

Feng, Liang Feng

dc.rights.license	CC-BY-NC-ND
dc.contributor.advisor	Farshidi, S.
dc.contributor.author	Feng, Liang Feng
dc.date.accessioned	2025-09-03T23:02:30Z
dc.date.available	2025-09-03T23:02:30Z
dc.date.issued	2025
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/50316
dc.description.abstract	Software developers actively share reviews of software components in developer communities such as Stack Overflow. Analyzing these reviews can provide valuable insight into the strengths and weaknesses of software components. The objective of this study is to explore implementation strategies for applying sentiment analysis tools in specific domains, with a particular emphasis on software engineering, and to investigate approaches for evaluating domain-specific sentiment analysis methods and techniques. The key components of this study include constructing a comprehensive software quality keyword list for effective mapping and implementing accurate sentiment analysis algorithms. To evaluate the effectiveness of the models, an evaluation dataset of 4500 entries was created by leveraging advanced pre-trained models to generate sentiment labels. Finally, a dataset comprising over 50,000 relevant reviews was extracted from Stack Overflow, G2, and TrustRadius. A comparative analysis of four state-of-the-art models in this domain (Senti4SD, TextCNN, SentiStrength-SE, and RoBERTa) showed that RoBERTa outperforms the others on various metrics. Therefore, RoBERTa was selected as the sentiment analysis model for this study. On the evaluation dataset, it achieved a weighted average precision of 0.91 and a weighted average F1 score of 0.78.
dc.description.sponsorship	Utrecht University
dc.language.iso	EN
dc.subject	This research focuses on applying sentiment analysis techniques to review data in the software engineering domain. Given m (m ≥ 1) reviews as input, the output is m label sets L = (l_SC, l_QA, l_SP), one per review, where l_SC denotes the Software Component label, l_QA denotes the Quality Attribute label, and l_SP denotes the Sentiment Polarity label.
dc.title	Software Quality Assessment of Developer Community Reviews Using Sentiment Analysis Techniques
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.keywords	Sentiment Analysis; Software Quality Attribute; Software Engineering; Keywords Extraction.
dc.subject.courseuu	Computing Science
dc.thesis.id	53592
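
The task formulation in the dc.subject field above (m reviews in, one label triple per review out) can be sketched roughly as follows. This is only an illustrative sketch of the input/output shape: the names ReviewLabels, label_reviews, and toy_labeler are hypothetical, and the keyword rules are placeholders, not the keyword list or the RoBERTa model described in the thesis.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ReviewLabels:
    """One label triple per review (hypothetical container, not from the thesis)."""
    software_component: str   # Software Component label
    quality_attribute: str    # Quality Attribute label
    sentiment_polarity: str   # Sentiment Polarity label


def label_reviews(
    reviews: List[str],
    labeler: Callable[[str], ReviewLabels],
) -> List[ReviewLabels]:
    """Map m reviews (m >= 1) to m label triples, preserving input order."""
    if len(reviews) < 1:
        raise ValueError("expected at least one review (m >= 1)")
    return [labeler(review) for review in reviews]


def toy_labeler(review: str) -> ReviewLabels:
    """Placeholder rules for illustration only; a real system would use
    a curated keyword list and a trained sentiment model instead."""
    text = review.lower()
    component = "ExampleLib" if "examplelib" in text else "unknown"
    if "fast" in text:
        attribute, polarity = "performance", "positive"
    elif "slow" in text:
        attribute, polarity = "performance", "negative"
    else:
        attribute, polarity = "unknown", "neutral"
    return ReviewLabels(component, attribute, polarity)


labels = label_reviews(
    ["ExampleLib is fast", "ExampleLib feels slow on large inputs"],
    toy_labeler,
)
```

Passing the labeler as a parameter keeps the input/output contract separate from whichever keyword mapping or sentiment model actually produces the labels.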

Files in this item

Name: 381CFEBE-00FC-4560-9E3A-9FB035 ...
Size: 2.495 MB
Format: PDF

This item appears in the following Collection(s)

Theses
