Provable Privacy for Database Generation: an Information Theoretic Approach

Dorrestijn, J.E.G.

View/Open

paper.pdf (333.5Kb)

Publication date

2018

Author

Dorrestijn, J.E.G.

Metadata

Show full item record

Summary

Many methods exist to avoid disclosing sensitive information when releasing a database. However these methods either cannot guarantee that the information of individuals is secure or are aimed at specific use cases. In this paper we develop a method which is both provably private and retains the overall form of the original database. To achieve this we derive a privacy measure, epsilon-dependence. Intuitively, epsilon-dependence requires that the input and output databases are nearly independent. We show that epsilon-dependence can be seen as an information theoretic refinement of differential privacy. We then adapt the KRIMP algorithm to generate databases while satisfying epsilon-dependence. We show through experiments that the generated databases are comparable to the original databases when performing machine learning or itemset mining tasks. The results are especially good on larger databases.

URI

https://studenttheses.uu.nl/handle/20.500.12932/30524

Collections

Theses