Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor: Muriel van der Spek, Henri Bouma, Albert A. Salah
dc.contributor.advisor: Salah, Albert
dc.contributor.author: Blom, Frederieke
dc.date.accessioned: 2025-08-21T00:06:10Z
dc.date.available: 2025-08-21T00:06:10Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/49895
dc.description.abstract: Federated learning allows multiple parties to collaboratively train a model without sharing sensitive data, making it suitable for applications such as border control, criminal investigation, and device security. However, models trained in this way remain vulnerable to privacy attacks. Two notable threats are membership inference attacks (MIAs), which aim to determine whether specific samples were included in the training data, and model inversion (MI) attacks, which attempt to reconstruct training samples from the model. State-of-the-art defenses for MI include transfer learning (TL) and bidirectional dependency optimization (BiDO). For MIA, random cropping (RC) has shown strong mitigation potential. This study investigates whether RC can be applied to a model typically used to demonstrate MI attacks (without deteriorating this model's test accuracy), and how effective the combination of the three defenses is against a strong MIA. Each defense is applied individually, in pairs, and in full combination, with parameters fine-tuned accordingly. The models are then subjected to two state-of-the-art attacks: IF-GMI as the MI attack on undefended models, and LiRA as the MIA on both defended and undefended models. Each defense configuration demonstrates a reduction in data leakage, with acceptable utility and cost. The results show that the combination of multiple defenses (TL + BiDO + RC) achieves the greatest mitigation effect against MIA, without notable degradation in performance.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: The potential of robust model learning on image classification models as a defense against model inversion and membership inference attacks.
dc.title: White-box passive attacks and robust model learning defenses on image classification models
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Federated Learning; Bidirectional Dependency Optimization; Generative Model Inversion; Likelihood Ratio Attack; Membership Inference Attack; Model Inversion Attack; Random Cropping; Transfer Learning
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 51988
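The random cropping (RC) defense named in the abstract is a standard data-augmentation step: pad the input image and crop a window of the original size at a random offset. The following is a minimal sketch of that operation on a plain 2D list; the pad width and zero-padding choice are assumptions for illustration, not details taken from the thesis.

```python
import random

def random_crop(image, pad=2):
    """Zero-pad a 2D image, then crop back to its original size
    at a random offset. `pad` is an assumed hyperparameter."""
    h, w = len(image), len(image[0])
    # Zero-pad the image on all four sides.
    padded = [[0] * (w + 2 * pad) for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in image]
    padded += [[0] * (w + 2 * pad) for _ in range(pad)]
    # Pick a random top-left corner and crop an h-by-w window.
    top = random.randint(0, 2 * pad)
    left = random.randint(0, 2 * pad)
    return [row[left:left + w] for row in padded[top:top + h]]
```

Applied on every training batch, the random shift perturbs the exact pixel statistics each sample contributes, which is the property RC-style defenses exploit against membership inference.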

