        Discourse-Weighted Sentiment Analysis: Measuring the Impact of EDU Depth on Review Classification

        File
        Final_thesis_discourse_weighted_sentiment_analysis-FC_v2.pdf (990.3Kb)
        Publication date
        2024
        Author
        Bexiga Moreira de Carvalho, Filipe
        Summary
        This study investigates how rhetorical complexity affects sentiment analysis accuracy in online reviews by integrating Rhetorical Structure Theory (RST) with lexicon-based approaches. While traditional sentiment analysis tools effectively identify opinion-bearing words, they often struggle with complex discourse structures that modulate how sentiment is expressed. By examining 12,993 online reviews across three domains, this research explores the relationship between Elementary Discourse Unit (EDU) depth and improvement in sentiment classification. The study implements two exponential weighting schemes based on EDU depth to recalibrate sentiment scores. Results demonstrate a strong linear correlation between discourse tree depth and improvement in classification accuracy (r = 0.983, p < 0.001), with deeper structures showing performance gains of up to 50%. The analysis reveals distinct improvement patterns for the two misclassification types: reviews with positive star ratings but negative sentiment scores improved more (35-40%) than reviews with negative star ratings but positive sentiment scores (20-25%). Domain-specific variations also emerged, with food reviews showing the strongest correlation between depth and improvement. These findings advance our understanding of how rhetorical structure influences sentiment expression and highlight the need for analytical approaches that account for discourse complexity in automated sentiment analysis.
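The summary does not spell out the two exponential weighting schemes, so the following is only a minimal sketch of what depth-based exponential reweighting of EDU sentiment scores could look like. The function names, the `base` parameter, the `favor_deep` switch, and the choice of a normalized weighted mean are all assumptions made for illustration, not the thesis's actual formulas.

```python
from typing import List, Tuple


def depth_weight(depth: int, base: float = 2.0, favor_deep: bool = False) -> float:
    """Hypothetical exponential weight for an EDU at a given tree depth (root = 0).

    favor_deep=False gives base ** -depth (shallow EDUs dominate);
    favor_deep=True gives base ** depth (deep EDUs dominate).
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    exponent = depth if favor_deep else -depth
    return base ** exponent


def weighted_sentiment(
    edus: List[Tuple[float, int]], base: float = 2.0, favor_deep: bool = False
) -> float:
    """Weighted mean of (lexicon_score, depth) pairs; returns 0.0 for no EDUs."""
    if not edus:
        return 0.0
    weights = [depth_weight(depth, base, favor_deep) for _, depth in edus]
    weighted_sum = sum(score * w for (score, _), w in zip(edus, weights))
    return weighted_sum / sum(weights)


# Illustrative input: a negative EDU at the root and a positive EDU at depth 2.
review = [(-0.6, 0), (0.8, 2)]
shallow_heavy = weighted_sentiment(review, favor_deep=False)
deep_heavy = weighted_sentiment(review, favor_deep=True)
```

With this made-up input, the two directions of weighting pull the review score to opposite sides of zero, which is the kind of recalibration that could flip a misclassified review. Which direction and which base the thesis uses is not stated in the summary.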
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/48263
        Collections
        • Theses