dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Bloothooft, G.
dc.contributor.advisor: Feelders, A.J.
dc.contributor.author: Oosterlaken, R.A.J.
dc.date.accessioned: 2018-08-03T17:01:32Z
dc.date.available: 2018-08-03T17:01:32Z
dc.date.issued: 2018
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/30117
dc.description.abstract: In automated record linkage, name variation is often handled by limited means, such as an edit distance combined with a threshold value. However, names vary in ways that default similarity measures cannot reliably cope with. To overcome this limitation, an alternative, 'weighted' edit distance is proposed. This weighted edit distance assigns costs to operations based on previously seen operations that transform names into their known variants. Names often vary in similar ways, for example by adding the same suffixes. Operations that transform names into their known variants are therefore likely to resemble the operations between names and their as yet unseen variants. In this paper, methods are defined for gathering the data required to create a cost model that assigns costs to the operations of a weighted edit distance. Suggestions are then given on how to implement a cost model and a weighted edit distance based on this data, as well as how to test these implementations.
dc.description.sponsorship: Utrecht University
dc.format.extent: 321546
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Identifying Historical Person Names using Weighted Edit Distance
dc.type.content: Bachelor Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: onomastics; record linkage; weighted edit distance; dynamic costs
dc.subject.courseuu: Kunstmatige Intelligentie
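
The idea summarised in the abstract, learning operation costs from known name variants and using them in an edit distance, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the function names, the single-character operations, the cost formula 1 / (1 + count), and the example names are all hypothetical.

```python
from collections import Counter


def _unit_table(a, b):
    """Standard Levenshtein table with unit costs."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            sub = d[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d


def extract_operations(a, b):
    """Backtrace one optimal unit-cost alignment and list its edit operations."""
    d = _unit_table(a, b)
    i, j = len(a), len(b)
    ops = []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            if a[i - 1] != b[j - 1]:
                ops.append(("sub", a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", a[i - 1], ""))
            i -= 1
        else:
            ops.append(("ins", "", b[j - 1]))
            j -= 1
    return ops[::-1]


def build_cost_model(variant_pairs):
    """Assumed cost formula: operations seen more often become cheaper."""
    counts = Counter()
    for name, variant in variant_pairs:
        counts.update(extract_operations(name, variant))
    return {op: 1.0 / (1 + n) for op, n in counts.items()}


def weighted_edit_distance(a, b, costs, default=1.0):
    """Edit distance where each operation's cost is looked up in `costs`.

    Operations absent from the model fall back to `default`.
    """
    def cost(kind, x, y):
        return costs.get((kind, x, y), default)

    rows, cols = len(a) + 1, len(b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + cost("del", a[i - 1], "")
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + cost("ins", "", b[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            sub = d[i - 1][j - 1] + (0.0 if same else cost("sub", a[i - 1], b[j - 1]))
            delete = d[i - 1][j] + cost("del", a[i - 1], "")
            insert = d[i][j - 1] + cost("ins", "", b[j - 1])
            d[i][j] = min(sub, delete, insert)
    return d[-1][-1]


# Hypothetical training pairs: both variants add the suffix "s".
model = build_cost_model([("jan", "jans"), ("piet", "piets")])
seen_suffix = weighted_edit_distance("dirk", "dirks", model)
unseen_suffix = weighted_edit_distance("dirk", "dirkx", model)
```

Under these assumptions, a suffix observed in the training pairs costs less to insert than one never observed, so `seen_suffix` is smaller than `unseen_suffix`. A fixed threshold on this distance would then accept the familiar kind of variation more readily than an unfamiliar one.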

