Large Language Models as Mirrors of Societal Moral Standards
Summary
Cross-cultural moral variation has become evident throughout social media. Since the emergence of large language models (LLMs), the ethical implications of these discrepancies have grown in significance. Despite their capabilities, these models are often criticized for undesirable or even controversial output. Consequently, fields such as explainable NLP (XAI) have emerged to address this dilemma. Although moral variation has been examined in past research, the predominant methodology tends to adopt a broader perspective that may overlook subtle differences. This study therefore aims to fill the research gap by investigating cross-cultural moral variation with an emphasis on local explainability across four mono- and multilingual LLMs. Through language model probing, SHapley Additive exPlanations (SHAP), and an ethical-values dataset gathered from the World Values Survey (WVS), a fine-grained analysis was conducted. The study introduces the 'SHAP Logprob' model, built for token-level interpretations. Lastly, it addresses the challenges and limitations of interpreting cross-cultural moral variation through SHAP.
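The token-level attribution idea behind the 'SHAP Logprob' approach can be illustrated with a minimal sketch. This is not the paper's implementation: it computes exact Shapley values over a tiny toy "value function" standing in for an LLM's log-probability of a moral-judgment output given a subset of input tokens (the token list and the `WEIGHTS` scoring are purely hypothetical).

```python
import itertools
import math

def shapley_values(tokens, value_fn):
    """Exact Shapley values: each token's average marginal
    contribution to value_fn, averaged over all subsets of
    the remaining tokens (feasible only for short inputs)."""
    n = len(tokens)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for subset in itertools.combinations(others, r):
                # Classic Shapley weight: |S|! (n-|S|-1)! / n!
                weight = (math.factorial(r) * math.factorial(n - r - 1)
                          / math.factorial(n))
                with_i = value_fn([tokens[j] for j in sorted(subset + (i,))])
                without_i = value_fn([tokens[j] for j in subset])
                phi[i] += weight * (with_i - without_i)
    return phi

# Hypothetical stand-in for an LLM's log-probability of a moral
# judgment given only the tokens present in the subset.
WEIGHTS = {"stealing": -2.0, "is": 0.1, "wrong": 1.5}

def toy_logprob(subset):
    return sum(WEIGHTS.get(t, 0.0) for t in subset)

tokens = ["stealing", "is", "wrong"]
phi = shapley_values(tokens, toy_logprob)
```

Because the toy value function is additive, each token's Shapley value recovers its weight exactly, and the values sum to the difference between scoring the full sentence and the empty input (SHAP's efficiency property). A real pipeline would replace `toy_logprob` with a model call and use the `shap` library's sampling-based explainers rather than this exhaustive enumeration.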