        •   Utrecht University Student Theses Repository Home
        • UU Theses Repository
        • Theses
        • View Item
        Ensemble of Code Tables

        ICA_3754022.pdf (988.4Kb)
        Publication date
        2018
        Author
        Singh, J.
        Summary
        This master's thesis presents non-disjoint clustering algorithms based on the Minimum Description Length (MDL) principle. The algorithms capture the underlying distribution from different perspectives by compressing the data with a series of code tables. A cover algorithm describes how a code table is used to compress the database, and every code table is grown iteratively until compression no longer improves. Experiments show that the algorithms identify structure in the data, since the code tables compress the data to some extent. Clustering experiments show that all obtained code tables capture the general structure, while different code tables capture the different groups of patterns that are dissimilar to the general patterns. This confirms that the code tables view the data from different perspectives. Classification experiments show that, given the class labels, the code tables are dissimilar enough to capture the different characteristics of the classes. Without the class labels, the algorithms can still find the difference between the classes when the support is sufficiently low. It is also possible to identify multi-valued dependencies in the data: this is the case when code tables in a single iteration are anti-chains and later end up in the same code table.
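        To illustrate how a cover algorithm and a code table can work together, here is a minimal sketch. It is not the thesis's algorithm: it assumes a Krimp-style greedy cover (longest itemsets first, no overlap), measures only the encoded size of the data and ignores the cost of the code table itself, and uses hypothetical names (`cover`, `encoded_size`, `grow`) and a toy database invented for the example.

```python
from collections import Counter
from math import log2


def cover(transaction, code_table):
    """Greedily cover a transaction with non-overlapping itemsets.

    Assumption: code_table is ordered with longer itemsets first.
    """
    remaining = set(transaction)
    used = []
    for itemset in code_table:
        if itemset <= remaining:
            used.append(itemset)
            remaining -= itemset
    return used, remaining


def encoded_size(database, code_table):
    """Bits needed for the data part only (code table cost is ignored here)."""
    usage = Counter()
    for transaction in database:
        used, remaining = cover(transaction, code_table)
        if remaining:
            raise ValueError("code table cannot cover every item")
        usage.update(used)
    total = sum(usage.values())
    # Shannon-optimal code length: frequently used itemsets get short codes.
    return sum(-count * log2(count / total) for count in usage.values())


def grow(database, code_table, candidates):
    """Keep a candidate itemset only if it makes the encoding smaller."""
    best = encoded_size(database, code_table)
    for candidate in candidates:
        trial = sorted(code_table + [candidate], key=len, reverse=True)
        size = encoded_size(database, trial)
        if size < best:
            code_table, best = trial, size
    return code_table, best


# Toy database (hypothetical), itemsets stored as frozensets.
database = [frozenset("abc"), frozenset("ab"), frozenset("abc"), frozenset("c")]
singletons = [frozenset("a"), frozenset("b"), frozenset("c")]

baseline = encoded_size(database, singletons)
grown_table, grown_size = grow(database, singletons, [frozenset("ab")])
```

        In this toy setting the itemset {a, b} is accepted because covering with it shortens the encoding compared with the singleton-only code table. A full MDL score would also add the description length of the code table itself.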
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/30523