        On the influence of dataset characteristics on classifier performance

        View/Open
        T. van Gemert - On the influence of meta-features on classifier performance .pdf (191.4Kb)
        Publication date
        2017
        Author
        Gemert, T. van
        Summary
        The field of Machine Learning has been rapidly gaining attention from both academic and commercial parties. To promote fast deployment of analytical solutions, several tools have been developed to aid the novice user. Concurrently, fields like meta-learning have been making great progress in developing models of algorithm performance on different datasets. One of the central issues in Machine Learning, for both novices and experts, is which learning algorithm to use on a given dataset. Although many solutions have been proposed, a definitive solution has yet to be found. We will argue that a possible solution lies in a deeper understanding of the data we are dealing with. By characterizing datasets in terms of meta-features such as the size of the dataset, we can compare and discuss different datasets and relate them to algorithm performance. A better empirical and analytical understanding of the data may also improve algorithm development, yield significant time savings and offer new insights. Focussing on classification algorithms, we present a number of ways in which meta-features can contribute to machine learning research. We will discuss several challenges and guidelines that have been proposed in the relevant literature, and lastly we present what little is known about several meta-features and their relation to a classifier's performance.
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/26150
        Collections
        • Theses