
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Feelders, Ad
dc.contributor.author: Kuppevelt, D.E. van
dc.date.accessioned: 2014-02-18T18:00:33Z
dc.date.available: 2014-02-18T18:00:33Z
dc.date.issued: 2014
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/16147
dc.description.abstract: With a growing portion of the web dedicated to the discussion, review, and retail of consumer products, it is increasingly relevant to develop methods for the automated extraction of Product Entities from user-generated text. In addition, it is important that the extraction models provide feedback about the quality of their output, in the form of a confidence score associated with each entity. Conditional Random Fields, which are designed as a discriminative solution for structured output prediction, have been shown to be successful for the related problem of Named Entity Recognition. Furthermore, their probabilistic nature provides a natural way to obtain a confidence score. In this thesis, the optimal application of Conditional Random Fields to the specific problem of identifying Product Entities is investigated. A set of experiments is designed and executed to compare different choices of feature sets. The results show that Conditional Random Fields perform better than heuristic models for this task. In addition, several existing methods for confidence scoring are experimentally compared, and an optimized algorithm to calculate the exact confidence estimate (known as the Constrained Forward Backward estimate) is introduced. The experiments show that the more heuristic Gamma Product method performs comparably to the Constrained Forward Backward method, and thus provides an alternative confidence estimate to use in practice. Finally, the F1 score for Product Entity Recognition is improved further by combining models, using the confidence score for voting.
dc.description.sponsorship: Utrecht University
dc.format.extent: 494409
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.title: Identifying Product Entities in text with Conditional Random Fields
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Named Entity Recognition, Conditional Random Fields
dc.subject.courseuu: Computing Science

