Article Information
Abstract
English
Frequent Itemsets Mining (FIM) is a typical data mining task that has gained much attention. Because of concerns about individual privacy, many studies have focused on privacy-preserving FIM. Differential privacy has emerged as a promising scheme for protecting individual privacy in data mining against adversaries with arbitrary background knowledge. In this paper, we present an approach to mining frequent itemsets under the differential privacy model, a recently introduced definition that provides rigorous privacy guarantees in the presence of arbitrary external information. The main idea of differentially private FIM is to perturb the support of each item, which hides the changes caused by the absence of any single item. The key observation is that pruning unpromising candidate items effectively reduces the noise added by the differential privacy mechanism, which yields a better tradeoff between the utility and the privacy of the result. To remove the unpromising items from each candidate set effectively, we use a progressive sampling method to obtain a superset of the frequent items, which is usually much smaller than the original item database. The sampled set is then used to shrink the candidate set. Extensive experiments on real data sets show that our algorithm greatly reduces the scale of the injected noise and outputs frequent itemsets with high accuracy while satisfying differential privacy.
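The key observation in the abstract, that a smaller candidate set means less noise per support count, can be illustrated with a minimal Python sketch. This is not the paper's algorithm. It assumes the Laplace mechanism, a privacy budget `epsilon` split evenly over the candidate items, and a sensitivity of 1 per item count. All function names are hypothetical, and the single-pass sampling step stands in for the progressive sampling method, whose details and privacy accounting are not given in the abstract.

```python
import random
from collections import Counter


def item_supports(transactions):
    """Count, for every item, the number of transactions that contain it."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    return counts


def noise_scale(num_candidates, epsilon):
    """Laplace scale when epsilon is split evenly over the candidates.

    Assumption: each support count has sensitivity 1, so each count
    receives epsilon / num_candidates of the budget.
    """
    return num_candidates / epsilon


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def prune_candidates(transactions, min_support, sample_rate, rng):
    """Hypothetical pruning step: keep items that look frequent in a sample.

    Assumption: this step is NOT privacy-accounted in this sketch.
    """
    sample = [t for t in transactions if rng.random() < sample_rate]
    if not sample:
        return set(item_supports(transactions))
    # Rescale the threshold to the size of the sample.
    threshold = min_support * len(sample) / len(transactions)
    counts = item_supports(sample)
    return {item for item, c in counts.items() if c >= threshold}


def noisy_frequent_items(transactions, candidates, min_support, epsilon, rng):
    """Perturb the support of each candidate, then threshold the noisy counts."""
    if not candidates:
        return {}
    scale = noise_scale(len(candidates), epsilon)
    counts = item_supports(transactions)
    result = {}
    for item in sorted(candidates):
        noisy = counts.get(item, 0) + laplace_noise(scale, rng)
        if noisy >= min_support:
            result[item] = noisy
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    data = [["a", "b"], ["a", "c"], ["a", "b"], ["d"]] * 50
    candidates = prune_candidates(data, min_support=100, sample_rate=0.5, rng=rng)
    print(noisy_frequent_items(data, candidates, 100, epsilon=1.0, rng=rng))
```

Under these assumptions the noise scale grows linearly with the number of candidates, so shrinking the candidate set from 100 items to 10 cuts the scale by a factor of 10 for the same `epsilon`.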
Table of Contents
1. Introduction
2. Related Works
3. Preliminaries
3.1. Differential Privacy
3.2. Frequent Itemsets Mining
4. Candidate Pruning-based FIM
4.1. A Straightforward Approach
4.2. Progressive Sampling
4.3. Candidate Pruning-Based FIM
4.4. Privacy Analysis
5. Experiments
5.1. Experimental Settings
5.2. Competing Algorithms
6. Conclusion
Acknowledgements
References