        •   Utrecht University Student Theses Repository Home

        Automatic Assignment of Section Structure to Texts of Dutch Court Judgments

        thesis-2016-09-06.pdf (10.30 MB)
        Publication date
        2016
        Author
        Trompper, M.F.A.
        Summary
        A growing amount of Dutch case law is openly distributed on Rechtspraak.nl. Currently, many documents are not marked up or marked up only very sparsely, hampering our ability to process these documents automatically. In this thesis, we explore the problem of automatic assignment of a section structure to the texts of Dutch court judgments. To this end, we develop a database that mirrors the XML data offering of Rechtspraak.nl. We experiment with Linear-Chain Conditional Random Fields to label text elements with their roles in the document (text, title or numbering). Given a list of labels, we experiment with Probabilistic Context-Free Grammars to generate a parse tree which represents the section hierarchy of a document. We report F1 scores of around 0.91 for tagging section titles (around 1.0 for other types) and 0.92 for parsing the tokens into a section hierarchy.
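
The labeling step described in the summary can be illustrated with a minimal sketch of Viterbi decoding for a linear-chain model over the three roles (text, title, numbering). This is not the thesis implementation: a trained CRF learns its feature weights from data, whereas the emission and transition scores below are hand-set assumptions, and every name here (`emission_score`, `transition_score`, `viterbi`, the sample lines) is hypothetical.

```python
# Toy linear-chain decoder: picks the best label sequence for a list of lines.
# All scores are hand-set assumptions for illustration, not learned weights.

LABELS = ["text", "title", "numbering"]

# Assumed pairwise scores; any pair not listed scores 0.0.
TRANSITIONS = {
    ("numbering", "title"): 1.0,      # a section number is often followed by a title
    ("title", "text"): 0.5,           # a title is often followed by body text
    ("title", "title"): -1.0,         # two titles in a row are unlikely
    ("numbering", "numbering"): -1.0, # two numberings in a row are unlikely
}


def transition_score(prev_label, label):
    return TRANSITIONS.get((prev_label, label), 0.0)


def emission_score(line, label):
    """Score how well a single line fits a label, using two crude features."""
    tokens = line.split()
    first = tokens[0] if tokens else ""
    # A lone token such as "1." or "2.3." counts as numbering.
    is_number = len(tokens) == 1 and first.rstrip(".").replace(".", "").isdigit()
    is_short = len(tokens) <= 4
    if label == "numbering":
        return 2.0 if is_number else -2.0
    if label == "title":
        return 1.0 if (is_short and not is_number) else -1.0
    return 1.0 if not is_short else -0.5  # label == "text"


def viterbi(lines):
    """Return the highest-scoring label sequence for the given lines."""
    if not lines:
        return []
    # best[i][label] = best total score of any path ending in `label` at line i
    best = [{lab: emission_score(lines[0], lab) for lab in LABELS}]
    back = []  # back[i][label] = best previous label for `label` at line i + 1
    for line in lines[1:]:
        scores, pointers = {}, {}
        for lab in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + transition_score(p, lab))
            scores[lab] = (best[-1][prev] + transition_score(prev, lab)
                           + emission_score(line, lab))
            pointers[lab] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Made-up sample input, one text element per entry.
sample = [
    "1.",
    "Procesverloop",
    "De rechtbank heeft kennisgenomen van de stukken van het geding.",
    "2.",
    "Beoordeling",
    "Het beroep is ongegrond en wordt daarom afgewezen door de rechtbank.",
]
labels = viterbi(sample)
```

The point of the chain structure is that each label is chosen jointly with its neighbours rather than line by line, so a short line directly after a numbering is pulled toward the title role. A label sequence like this would then be the input to the grammar-based parsing step that builds the section hierarchy.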
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/24346