Emotion Recognition in Queer Oral History Archives: The Challenges of Measuring Performance and Queer Bias in Affective Computing
Summary
This thesis investigates the potential and challenges of applying automated text-based emotion recognition techniques, particularly Large Language Models (LLMs), to queer oral history archives. While these archives offer rich and nuanced accounts of lived experiences, analysing them for emotional content presents unique complexities. A key challenge in this domain is the lack of a reliable ground truth for emotion labels. Attempts to establish such a ground truth through semi-automated and fully manual annotation were unsuccessful because of low inter-rater reliability, which highlights the inherently subjective nature of emotion perception and the difficulty of reaching consensus even among human annotators. Moreover, an analysis based on queer keywords showed evidence of bias in all studied models: EmoLex, a pretrained DistilRoBERTa model, and zero-shot GPT-4o and GPT-4o Mini. Further analysis of GPT-4o Mini revealed additional biases in its text generation, including a tendency to associate queer identities with negative emotions or to stereotype particular identities. These biases, if unaddressed, can lead to inaccurate and potentially harmful interpretations of the emotions expressed in these invaluable historical records. The results underscore the need for more robust and context-aware annotation procedures that account for the nuances of individual storytelling and the diverse ways emotions are conveyed in oral histories. With such procedures in place, oral history archives could serve as a valuable source of data for ensuring that minority groups are represented in emotion detection models, thereby mitigating future bias.