
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Werf, J.M.E.M. van der
dc.contributor.author: Dabroek, M.G.
dc.date.accessioned: 2016-11-17T18:00:36Z
dc.date.available: 2016-11-17T18:00:36Z
dc.date.issued: 2016
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/24806
dc.description.abstract: In the current information age, it is crucial for an organisation to integrate all of its available source systems to provide deep insights, adhere to regulations, or gain a competitive edge. However, data integration often proves to be a tedious and costly process. In this study we aim to answer the question: can we formulate and construct a semi-automatic distributed system to enable scalable and reuse-oriented data integration? To aid the process of data integration, we propose a system that takes advantage of previously provided associations between source schemas. To provide a common language between disparate sources, we introduce an ontology into our proposed solution. We attempt to semi-automatically match attributes from the schemas of source systems to entities of the ontology by utilising a self-learning aspect and a feedback loop. We do so by applying a dual approach that uses both semantic and structural aspects of the source and the ontology. We constructed a proof-of-concept and performed a user acceptance study to evaluate our approach and validate our solution. Our contribution is two-fold: we distribute our data integration system, thereby contributing to the scalability of the system, and we reuse previously obtained results to enable semi-automatic matching. We performed both quantitative and qualitative analyses to evaluate the accuracy and feasibility of our system. The outcome suggests that our solution has merit and shows that end-users have positive expectations towards its use and performance.
dc.description.sponsorship: Utrecht University
dc.format.extent: 3600810
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Scalable and Reuse-Oriented Data Integration: A Distributed Semi-Automatic Approach
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: data integration; semantic integration; scalable systems; reuse; reference architecture; schema matching; ontology; Elasticsearch; UTAUT
dc.subject.courseuu: Business Informatics

