        •   Utrecht University Student Theses Repository Home
        Learnable Fingerprints for Large Language Models

        Frenk Dragar MSc Thesis - Learnable Fingerprints for Large Language Models.pdf (4.858Mb)
        Publication date
        2025
        Author
        Dragar, Frenk
        Summary
        The rapid advancement of generative artificial intelligence (AI), especially large language models (LLMs), has brought unprecedented capabilities in text generation and, with them, an urgent need for methods that can identify AI-generated text and prevent misuse. Techniques such as watermarking, which mark text or images as AI-generated, are being explored in the field but remain in their infancy and are especially challenging for textual output. This thesis focuses on model fingerprinting techniques, i.e. methods that embed fingerprints into a deep generative model so that the model can be identified via prompting; such fingerprints can also be used to authenticate the origin of AI-generated text. We propose a fine-tuning-based method to embed learnable fingerprints within LLMs, enabling black-box model authentication without requiring access to model parameters. We evaluate the method for several desirable properties of fingerprints, such as preservation of generated text quality and robustness against attacks. Our experiments show that model quality is maintained, even with quantization, but that fingerprints are susceptible to removal via fine-tuning and are not immune to detection via data leakage. Additionally, we experiment with combining model fingerprints with common watermarking methods that embed signatures into the generated text, and evaluate which watermarking paradigms can be used in combination with model fingerprinting. Our motivation is to provide first insights into the potential of combining the strengths of both techniques for broader purposes and for application to AI regulation, trustworthiness, detection, and authentication.
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/48360
        Collections
        • Theses