Discourse-Weighted Sentiment Analysis: Measuring the Impact of EDU Depth on Review Classification
Summary
This study investigates how rhetorical complexity affects the accuracy of sentiment analysis on online reviews by integrating Rhetorical Structure Theory (RST) with lexicon-based approaches. Traditional sentiment analysis tools identify opinion-bearing words effectively, but they often struggle with complex discourse structures that modulate how sentiment is expressed. Using 12,993 online reviews from three domains, the research examines the relationship between Elementary Discourse Unit (EDU) depth and improvement in sentiment classification. Two exponential weighting schemes based on EDU depth are implemented to recalibrate sentiment scores. The results show a strong linear correlation between discourse tree depth and improvement in classification accuracy (r = 0.983, p < 0.001), with deeper structures showing improvements of up to 50%. The two misclassification types improve at different rates: reviews with positive star ratings but negative sentiment scores improved more (35-40%) than reviews with negative star ratings but positive sentiment scores (20-25%). The effect also varies by domain, with food reviews showing the strongest correlation between depth and improvement. These findings advance our understanding of how rhetorical structure influences sentiment expression and highlight the need for analytical approaches that account for discourse complexity in automated sentiment analysis.
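To make the idea of depth-based recalibration concrete, the following is a minimal sketch of how an exponential weight on EDU depth could be combined with lexicon scores. The summary does not specify the two weighting schemes, so everything here is an assumption for illustration only: the function names `depth_weight` and `weighted_sentiment`, the base of 2.0, the convention that the root of the discourse tree has depth 0, and the choice of one scheme that decays with depth and one that grows with it. The EDU scores in the example are hypothetical values, not data from the study.

```python
from typing import List, Tuple


def depth_weight(depth: int, base: float = 2.0, favor_shallow: bool = True) -> float:
    """Exponential weight for an EDU at a given tree depth (root = 0).

    Assumed schemes (not taken from the study):
      favor_shallow=True  -> base ** (-depth), shallow EDUs dominate
      favor_shallow=False -> base ** depth,    deep EDUs dominate
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    exponent = -depth if favor_shallow else depth
    return base ** exponent


def weighted_sentiment(
    edus: List[Tuple[float, int]],
    base: float = 2.0,
    favor_shallow: bool = True,
) -> float:
    """Weighted mean of per-EDU lexicon scores.

    Each item in `edus` is (lexicon_score, depth). Returns 0.0 for an
    empty review so that callers always receive a number.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for score, depth in edus:
        w = depth_weight(depth, base, favor_shallow)
        weighted_sum += w * score
        weight_total += w
    return weighted_sum / weight_total if weight_total else 0.0


if __name__ == "__main__":
    # Hypothetical review with two EDUs: a positive clause near the root
    # and a negative clause deeper in the tree.
    review = [(0.8, 1), (-0.6, 3)]
    print(weighted_sentiment(review, favor_shallow=True))
    print(weighted_sentiment(review, favor_shallow=False))
```

In this hypothetical example the unweighted mean of the two scores is mildly positive, while the two schemes pull the review score in opposite directions. This illustrates why the choice of weighting scheme can flip the predicted polarity of a review whose EDUs disagree.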