dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Farshidi, S.
dc.contributor.author: Feng, Liang Feng
dc.date.accessioned: 2025-09-03T23:02:30Z
dc.date.available: 2025-09-03T23:02:30Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/50316
dc.description.abstract: Software developers actively share reviews of software components in communities such as Stack Overflow. Analyzing these reviews can provide valuable insight into the strengths and weaknesses of software components. The objective of this study is to explore implementation strategies for applying sentiment analysis tools in specific domains, with a particular emphasis on software engineering, and to investigate approaches for evaluating domain-specific sentiment analysis methods and techniques. The key components of the study are constructing a comprehensive software quality keyword list for effective mapping and implementing accurate sentiment analysis algorithms. To evaluate the models, an evaluation dataset of 4,500 entries was created by using advanced pre-trained models to generate sentiment labels. Finally, a dataset of over 50,000 relevant reviews was extracted from Stack Overflow, G2, and TrustRadius. A comparative analysis of four state-of-the-art models in this domain (Senti4SD, TextCNN, SentiStrength-SE, and RoBERTa) showed that RoBERTa outperforms the others on various metrics, so it was selected as the sentiment analysis model for this study. On the evaluation dataset, RoBERTa achieved a weighted average precision of 0.91 and a weighted average F1 score of 0.78.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This research focuses on applying sentiment analysis techniques to review data in the software engineering domain. Given m (m ≥ 1) reviews as input, the output is m label sets L = (l_SC, l_QA, l_SP), one per review, where l_SC denotes the Software Component label, l_QA denotes the Quality Attribute label, and l_SP denotes the Sentiment Polarity label.
dc.title: Software Quality Assessment of Developer Community Reviews Using Sentiment Analysis Techniques
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Sentiment Analysis; Software Quality Attribute; Software Engineering; Keywords Extraction
dc.subject.courseuu: Computing Science
dc.thesis.id: 53592