
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Feelders, A.
dc.contributor.author: Trompper, M.F.A.
dc.date.accessioned: 2016-09-19T17:00:39Z
dc.date.available: 2016-09-19T17:00:39Z
dc.date.issued: 2016
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/24346
dc.description.abstract: A growing amount of Dutch case law is openly distributed on Rechtspraak.nl. Currently, many documents are not marked up or marked up only very sparsely, hampering our ability to process these documents automatically. In this thesis, we explore the problem of automatic assignment of a section structure to the texts of Dutch court judgments. To this end, we develop a database that mirrors the XML data offering of Rechtspraak.nl. We experiment with Linear-Chain Conditional Random Fields to label text elements with their roles in the document (text, title or numbering). Given a list of labels, we experiment with Probabilistic Context-Free Grammars to generate a parse tree which represents the section hierarchy of a document. We report F1 scores of around 0.91 for tagging section titles (around 1.0 for other types) and 0.92 for parsing the tokens into a section hierarchy.
dc.description.sponsorship: Utrecht University
dc.format.extent: 10805667
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Automatic Assignment of Section Structure to Texts of Dutch Court Judgments
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Conditional Random Fields; Probabilistic Context Free Grammars; automatic markup; court judgments
dc.subject.courseuu: Cognitive Artificial Intelligence

