dc.rights.license: CC-BY-NC-ND
dc.contributor: Sofoklis Kitharidis
dc.contributor.advisor: Bosch, Antal van den
dc.contributor.author: Kitharidis, Sofoklis
dc.date.accessioned: 2023-08-11T00:02:09Z
dc.date.available: 2023-08-11T00:02:09Z
dc.date.issued: 2023
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/44626
dc.description.abstract: This thesis investigates the application of unsupervised learning algorithms, namely KMeans, Latent Dirichlet Allocation (LDA), BERTopic, and Hierarchical clustering, to analyze customer complaint data in the banking sector. The research aims to uncover patterns, topics, and insights from the complaints to enhance customer satisfaction strategies. The problem statement revolves around understanding the impact of different natural language processing methods on the comprehension of financial complaint data and their comparative performance. The key research question addresses how various NLP methods influence the understanding of financial complaint data and how these methods can be compared. To address this question, the study utilizes four unsupervised learning algorithms: KMeans, LDA, BERTopic, and Hierarchical clustering. KMeans is employed with Word2Vec, Doc2Vec, TF-IDF and BERT embeddings, while LDA is applied using Bag of Words, TF-IDF, and Word2Vec representations. BERTopic with DBSCAN and the hierarchical clustering algorithm are also explored with Word2Vec, Doc2Vec, TF-IDF and BERT embeddings. The analysis reveals significant findings, including the identification of key topics in the customer complaints dataset and the comparison of different clustering approaches. The results demonstrate that KMeans with Word2Vec embeddings achieves the highest cluster separation and density, indicating its superior performance. LDA highlights relevant topics related to loans, payments, communication, debt, and banking services. BERTopic with DBSCAN demonstrates improved cluster separation and provides precise and distinctive topics. In summary, this research provides valuable insights into the understanding of financial complaint data using unsupervised learning algorithms. The findings contribute to the development of customer satisfaction improvement strategies in the banking industry.
Last but not least, the study addresses ethical considerations, such as privacy and data integrity, ensuring responsible research practices throughout the analysis.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Comparative Analysis of Unsupervised Learning Techniques for Topic Extraction in Bank Complaints
dc.title: Comparative Analysis of Unsupervised Learning Techniques for Topic Extraction in Bank Complaints
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Unsupervised Learning Algorithms; KMeans; Latent Dirichlet Allocation (LDA); BERTopic; Hierarchical Clustering; Customer Complaint Data; Banking Sector; Patterns; Topics; Insights; Customer Satisfaction Strategies; Natural Language Processing (NLP); Financial Complaint Data; Comparative Performance; Research Question; Word2Vec; Doc2Vec; TF-IDF; BERT Embeddings; Bag of Words; DBSCAN; Cluster Separation; Cluster Density; Ethical Considerations
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 21627

