Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Nguyen, Dong
dc.contributor.author: Tessels, Lisa
dc.date.accessioned: 2024-07-24T23:04:28Z
dc.date.available: 2024-07-24T23:04:28Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/46875
dc.description.abstract: This paper aimed to determine the most effective classifier for identifying registered 'caring communities' using data from the Dutch Chamber of Commerce. I optimized and assessed the performance of four classifiers: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Tree (GBDT). The results show that LR consistently outperformed the other models across 2022 and 2023 test sets, excelling across all evaluation metrics. While GBDT showed competitive performance, SVM and RF were less effective. Despite LR's strengths, improvements in recall and data quality are essential for better identification of caring communities. Without these improvements, the algorithm may underestimate the total number of caring communities, leading to an incomplete understanding of their prevalence.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: I determined the most effective classifier for identifying registered caring communities using data from the Dutch Chamber of Commerce. I optimized and assessed the performance of four classifiers: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Tree (GBDT).
dc.title: Identifying Caring Communities Within Dutch Chamber Of Commerce Data: A Classifier Comparison
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Machine learning, Classification, Caring Communities
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 34890


