        •   Utrecht University Student Theses Repository Home


        How can advancements in the data science space such as vectorization, graph traversals and probabilistic record linkage, enhance entity resolution in multi-source data environments?

        Veips_Filips_Applied_Data_Science_masters_thesis_V2.pdf (844.3Kb)
        Publication date
        2024
        Author
        Veips, Filips
        Summary
        Entity matching is an essential area of study when working with data. Matching records that refer to the same entity can substantially increase the value obtained from the data. The main problems in entity matching come from two sources: flaws in the data and the effectiveness of the matching itself. This research works through these problems in order to arrive at an effective entity resolution algorithm capable of dealing with real-world data. We constructed four entity matching models: the probabilistic model SPLINK, two machine learning models (logistic regression and support vector machines), and a BERT-based transformer. All models were applied to the same data, which was preprocessed accordingly. The SPLINK model showed the best result and can be used in similar tasks in the future. However, the other models also performed promisingly, and their use can be viable.
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/47779
        Collections
        • Theses