Microdata Privacy Protection Through Permutation-Based Approaches
Date
2008-03-25
Abstract
Data analysts often prefer access to data in the form of original tuples (i.e., microdata) rather than pre-aggregated statistics, since microdata offers greater flexibility and availability of information. Two problems must be addressed before microdata is released. First, individuals' privacy needs to be adequately protected; in general, the data is anonymized before sharing. Second, the utility of the anonymized microdata should be maintained, so that common aggregate queries can be answered with reasonable accuracy.
Most existing work on microdata anonymization is based on attribute generalization. Though popular, these approaches have a limitation: generalizing attributes makes it difficult to answer typical aggregate queries with reasonable accuracy.
This dissertation investigates new techniques to address the limitations of existing approaches.
We propose to anonymize microdata through permutation-based approaches. First, we extend existing privacy goals to better fit the protection requirements of numerical data, and develop a scheme that achieves the extended goal through sensitive attribute permutation. Second, we propose a stronger privacy goal under which an attacker can learn from the microdata only that an individual's sensitive attribute follows a pre-specified target distribution, and nothing more. We combine sensitive attribute permutation and generalization techniques to achieve this goal. To obtain better query answers when the target distribution is far from that of the original microdata, we further provide mechanisms that give users finer control over the tradeoff between privacy and accuracy. Third, we extend our techniques to anonymize graph data and to support accurate answering of queries that involve graph properties. Specifically, we partition the nodes and relabel the nodes within each partition, which is itself a form of permutation. Finally, we study anonymization techniques that support personalized privacy, allowing individuals to flexibly control the privacy protection they desire.
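As a rough illustration of the general idea of sensitive attribute permutation, the sketch below shuffles a sensitive column within groups of records. It is not the dissertation's algorithm: the function name permute_sensitive, the field names zip, age, and salary, the sample records, and the choice to group by an exact attribute value are all assumptions made for this example. It shows only why a permutation within a group breaks the link between an individual and a value while leaving per-group aggregates unchanged.

```python
import random
from collections import defaultdict


def permute_sensitive(records, group_key, sensitive, seed=None):
    """Return a copy of `records` with the `sensitive` values shuffled
    among the records that share the same `group_key` value.

    Hypothetical helper for illustration only; how groups are formed in
    the dissertation is not described in the abstract.
    """
    rng = random.Random(seed)

    # Collect the row indices belonging to each group.
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[rec[group_key]].append(idx)

    # Copy every record so the input is left untouched.
    out = [dict(rec) for rec in records]

    # Shuffle the sensitive values inside each group and write them back.
    for indices in groups.values():
        values = [records[i][sensitive] for i in indices]
        rng.shuffle(values)
        for i, value in zip(indices, values):
            out[i][sensitive] = value
    return out


def group_sum(rows, group_key, sensitive):
    """Aggregate query: sum of the sensitive attribute per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[sensitive]
    return dict(totals)


# Made-up sample data (assumption, not from the dissertation).
records = [
    {"zip": "27601", "age": 34, "salary": 52000},
    {"zip": "27601", "age": 45, "salary": 61000},
    {"zip": "27601", "age": 29, "salary": 48000},
    {"zip": "27606", "age": 51, "salary": 75000},
    {"zip": "27606", "age": 38, "salary": 67000},
]

anonymized = permute_sensitive(records, "zip", "salary", seed=0)

# Per-group sums are identical before and after the permutation.
print(group_sum(records, "zip", "salary") == group_sum(anonymized, "zip", "salary"))
```

Because the sensitive values are only rearranged inside a group, any aggregate computed over whole groups (sum, mean, count) is exact, whereas queries that cut across a group can only be answered approximately. In this sketch, larger groups would give more uncertainty about who holds which value and less accuracy for such queries, which is one way to picture the privacy and accuracy tradeoff the abstract mentions.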
Keywords
microdata, privacy, security, anonymization, permutation
Degree
PhD
Discipline
Operations Research
Computer Science