Provable Privacy for Database Generation: an Information Theoretic Approach
dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Siebes, Dr. A.P.J.M. | |
dc.contributor.author | Dorrestijn, J.E.G. | |
dc.date.accessioned | 2018-08-24T17:00:39Z | |
dc.date.available | 2018-08-24T17:00:39Z | |
dc.date.issued | 2018 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/30524 | |
dc.description.abstract | Many methods exist to avoid disclosing sensitive information when releasing a database. However these methods either cannot guarantee that the information of individuals is secure or are aimed at specific use cases. In this paper we develop a method which is both provably private and retains the overall form of the original database. To achieve this we derive a privacy measure, epsilon-dependence. Intuitively, epsilon-dependence requires that the input and output databases are nearly independent. We show that epsilon-dependence can be seen as an information theoretic refinement of differential privacy. We then adapt the KRIMP algorithm to generate databases while satisfying epsilon-dependence. We show through experiments that the generated databases are comparable to the original databases when performing machine learning or itemset mining tasks. The results are especially good on larger databases. | |
dc.description.sponsorship | Utrecht University | |
dc.format.extent | 341512 | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.title | Provable Privacy for Database Generation: an Information Theoretic Approach | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.courseuu | Computing Science |