dc.rights.license | CC-BY-NC-ND | |
dc.contributor | Michael Behrisch, Yuncong Yu, Alex Telea | |
dc.contributor.advisor | Behrisch, Michael | |
dc.contributor.author | Kirschstein Schäfer, Oscar | |
dc.date.accessioned | 2024-02-15T15:06:29Z | |
dc.date.available | 2024-02-15T15:06:29Z | |
dc.date.issued | 2024 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/46014 | |
dc.description.abstract | Machine Learning and Visual Analytics have received increasing attention in recent years both in research and production.
These two fields overlap in the evaluation of model performance. Classical evaluation metrics have limited explainability, do not exploit the visualisation capabilities available today, and give no insight into how models perform on different subgroups of the input data.
Our goal for this project is to present a data-, task-, and model-agnostic framework to overcome these limitations of classical evaluation metrics, capable of generating visual model performance profiles over clusters of the evaluation data of a machine learning problem.
Clustering is used to generate groups of similar input data points, potentially recognisable by machine learning engineers or other individuals in the machine learning model building pipeline, to be acted upon accordingly, given the cluster and performance pairs for an array of models.
After laying out the building blocks of this conceptual framework in the form of guidelines, we implement it as a web app for demonstration and validation.
Finally, we validate the quality of the framework through a Think Aloud Protocol study, a survey, and two use cases on time series data sets.
Thematic analysis is then performed on the transcripts resulting from the study.
The results point towards the viability of using our conceptual framework for understanding, evaluating and comparing machine learning model performance. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | We developed a web-based tool that visually showcases the efficacy of machine learning models by grouping similar data. We tested its usability through various methods, including a "Think Aloud Protocol." The tests suggest it's a useful tool for understanding model performance. | |
dc.title | Could you please ClAIrify?: A clustering based framework for machine learning model evaluation | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.courseuu | Artificial Intelligence | |
dc.thesis.id | 23033 | |