        •   Utrecht University Student Theses Repository Home
        Automatic Classification of Legal Violations in Cookie Banner Texts

        Thesis__Cookie_Banners__Marieke_van_Hofslot.pdf (661.5Kb)
        Publication date
        2023
        Author
        Hofslot, Marieke van
        Summary
        Cookie banners are designed to request consent from website visitors for the use of their personal data. Recent research suggests that a high percentage of cookie banners violate legal regulations as defined by the General Data Protection Regulation (GDPR) and the ePrivacy Directive. In this paper, we focus on the language used in these cookie banners and on whether these legal violations can be detected automatically. We use a small cookie banner dataset, annotated by five experts for legal violations, to test state-of-the-art classification models, namely BERT, LEGAL-BERT, BART in a zero-shot setting, and BERT with LIWC embeddings. Our results show that no single model outperforms the others in all classes, but in general, BERT and LEGAL-BERT provide the highest accuracy (70%-97%). However, even these best-performing models are influenced by the unbalanced class distributions in the dataset.
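The remark about unbalanced distributions can be illustrated with a small sketch. It is not taken from the thesis: the labels below are invented, and the helper names `accuracy` and `balanced_accuracy` are hypothetical. It only shows why plain accuracy can look high on a skewed class distribution even when a classifier never detects the minority class.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Invented example data (NOT from the thesis): 9 of 10 banners have no violation.
y_true = ["no_violation"] * 9 + ["violation"]

# A trivial baseline that always predicts the most frequent class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Under these assumed labels, the majority-class baseline reaches 90% accuracy while missing every violation, which is why a per-class measure is a useful companion to accuracy on small, skewed datasets.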
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/43468
        Collections
        • Theses