
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Bloothooft, Gerrit
dc.contributor.author: Joosse, C.W.
dc.date.accessioned: 2018-09-04T17:00:56Z
dc.date.available: 2018-09-04T17:00:56Z
dc.date.issued: 2018
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/30912
dc.description.abstract: Edit distance is often used in record linkage for real persons to express the similarity of two names. In historical data, names often have high spelling variance. This study investigates a method for dealing with high name spelling variance by using overlinking and filtering to generate matches on a dataset of historical civil registrations. The method tries to build sets of registrations of persons who belong to the same family by applying real-world knowledge to the generated matches. When using the four names of the parents mentioned on the registrations and an edit distance of 4 and 5, 80% to 85% of the generated matches are consistent with real-world knowledge.
dc.description.sponsorship: Utrecht University
dc.format.extent: 1408353
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Reconstructing families
dc.type.content: Bachelor Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Entity resolution, name matching, edit distance, record linkage
dc.subject.courseuu: Kunstmatige Intelligentie
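
The overlinking step described in the abstract could be sketched roughly as follows. This is an illustration only, not the thesis's implementation: the record layout, the field names in `PARENT_FIELDS`, the lowercasing, and the choice to sum the edit distance over the four parent names are all assumptions, since the abstract does not say how the distances are combined. The filtering step based on real-world knowledge is not shown.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Hypothetical field names for the four parent names on a registration.
PARENT_FIELDS = ("father_first", "father_last", "mother_first", "mother_last")


def total_distance(r1: dict, r2: dict) -> int:
    """Assumption: the distances of the four parent names are summed."""
    return sum(edit_distance(r1[f].lower(), r2[f].lower()) for f in PARENT_FIELDS)


def candidate_matches(records: list, max_distance: int) -> list:
    """Overlink: keep every pair whose summed distance is within the threshold."""
    return [(a["id"], b["id"])
            for a, b in combinations(records, 2)
            if total_distance(a, b) <= max_distance]
```

A generous threshold deliberately admits false matches; under this sketch, a later filter would discard candidate pairs that contradict real-world knowledge about families.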

