Utrecht University Student Theses Repository
        Identification and interpretation of heterogeneous sparse conditional independence structures using Gaussian Mixture Modelling and Gaussian Graphical Modelling

        File
        Thesis - Vladimir Hazeleger - Final version.pdf (17.87Mb)
        Publication date
        2020
        Author
        Hazeleger, V.
        Summary
        Humans, and researchers in particular, often categorise things in order to understand them better: subtypes of cancer, mental illness, personality, political affiliation, or opinion, for example. When it is unknown which subtypes exist, identifying them is a challenge. In artificial intelligence and computational data science, this challenge can be addressed with clustering, a form of unsupervised learning. Clustering methods divide data into subgroups based on features. However, it is often difficult to infer from this division how or why the subgroups differ. In this thesis, I address this shortcoming by combining a clustering method, Gaussian Mixture Modelling, with a structural estimation method, Gaussian Graphical Modelling. Structural estimation methods reveal relations between variables, which can be visualised as networks and then analysed and interpreted. The combined method divides the data based on the structure of these networks; comparing the network structures then reveals structural differences between subgroups. The method is first tested on artificial data, which shows that it is sensitive to small sample sizes. It is then applied to two datasets: data on Social Media Disorder among Dutch adolescents, and European data on public opinion of immigrants and refugees. Results for the Social Media Disorder data show three slightly different subtypes. Results for the public opinion data suggest three clear, distinguishable subtypes.
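The idea of comparing network structures between subgroups can be illustrated with a minimal sketch. It is not the method of the thesis: it assumes the subgroup memberships are already known (in the thesis they are estimated by the mixture model), uses synthetic data, and replaces sparse estimation with a simple hypothetical threshold of 0.2 on partial correlations computed from the inverse sample covariance. The function names `partial_correlations` and `edge_set` are illustrative only.

```python
import numpy as np


def partial_correlations(X):
    """Partial correlation matrix derived from the inverse sample covariance."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def edge_set(pcor, threshold=0.2):
    """Edges (i, j), i < j, whose absolute partial correlation exceeds the threshold.

    The threshold is an assumption made for this sketch, not a value from the thesis.
    """
    p = pcor.shape[0]
    return {
        (i, j)
        for i in range(p)
        for j in range(i + 1, p)
        if abs(pcor[i, j]) > threshold
    }


rng = np.random.default_rng(0)

# Two synthetic subgroups over three variables with identical means.
# In subgroup A, variables 0 and 1 are dependent; in subgroup B, variables 1 and 2.
cov_a = np.array([[1.0, 0.6, 0.0], [0.6, 1.0, 0.0], [0.0, 0.0, 1.0]])
cov_b = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.6], [0.0, 0.6, 1.0]])
X_a = rng.multivariate_normal(np.zeros(3), cov_a, size=2000)
X_b = rng.multivariate_normal(np.zeros(3), cov_b, size=2000)

edges_a = edge_set(partial_correlations(X_a))
edges_b = edge_set(partial_correlations(X_b))

# Edges present in one subgroup's network but not in the other's.
print("structural differences:", sorted(edges_a ^ edges_b))
```

Because the two synthetic subgroups share the same means and differ only in which variables are conditionally dependent, any difference between them shows up in the network structure rather than in the feature averages.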
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/36416