        Utrecht University Student Theses Repository

        Could you please ClAIrify?: A clustering based framework for machine learning model evaluation

        MTHESIS___ClAIrify2.pdf (1.998Mb)
        Publication date
        2024
        Author
        Kirschstein Schäfer, Oscar
        Summary
        Machine Learning and Visual Analytics have received increasing attention in recent years, both in research and in production. The two fields overlap in the evaluation of model performance. Classical evaluation metrics have limited explainability, do not exploit the visualisation capabilities now available, and give no insight into how models perform on different sub-groups of the input data. Our goal for this project is to present a data-, task-, and model-agnostic framework that overcomes these limitations by generating visual model performance profiles over clusters of the evaluation data of a machine learning problem. Clustering is used to generate groups of similar input data points, potentially recognisable by machine learning engineers or other individuals in the machine learning model building pipeline, to be acted upon accordingly given the cluster and performance pairs for an array of models. After laying out the building blocks of this conceptual framework in the form of guidelines, we implement it as a web app for demonstration and validation. Finally, we validate the quality of the framework through a Think Aloud Protocol study, a survey, and two use cases on time series data sets. Thematic analysis is then performed on the transcripts resulting from the study. The results point towards the viability of using our conceptual framework for understanding, evaluating, and comparing machine learning model performance.
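
        The core idea in the summary, profiling model performance per cluster of the evaluation data rather than with a single global score, can be illustrated with a minimal sketch. This is not the thesis implementation: the small k-means routine, the use of accuracy as the metric, the toy data, and the names `kmeans`, `performance_profile`, `model_a` and `model_b` are all assumptions chosen for illustration. The framework itself is described as data-, task-, and model-agnostic, so any clustering method and metric could take their place.

```python
from collections import defaultdict
from math import dist


def kmeans(points, k, iterations=20):
    """Tiny k-means; returns one cluster label per point.

    Centroids start at the first k points (a deterministic, illustrative choice).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def performance_profile(labels, y_true, predictions_by_model):
    """Return {model_name: {cluster: accuracy}} over the evaluation data."""
    profile = {}
    for name, y_pred in predictions_by_model.items():
        hits, totals = defaultdict(int), defaultdict(int)
        for lab, truth, pred in zip(labels, y_true, y_pred):
            totals[lab] += 1
            hits[lab] += int(truth == pred)
        profile[name] = {lab: hits[lab] / totals[lab] for lab in sorted(totals)}
    return profile


# Toy evaluation set (hypothetical): two well-separated groups of 2-D points.
X = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.3), (4.9, 5.0)]
y_true = [0, 1, 0, 1, 0, 1]
predictions = {
    "model_a": [0, 1, 0, 0, 0, 0],  # weak on the second group
    "model_b": [1, 1, 0, 1, 0, 1],  # weak on the first group
}

labels = kmeans(X, k=2)
profile = performance_profile(labels, y_true, predictions)
for name, per_cluster in profile.items():
    print(name, per_cluster)
```

        Both toy models have a similar global accuracy, yet the per-cluster profile shows that each fails on a different group of inputs, which is the kind of insight a single aggregate metric hides.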
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/46014
        Collections
        • Theses