A Grid-Based Swarm Intelligence Algorithm for Privacy-Preserving Data Mining

2019 ◽  
Vol 9 (4) ◽  
pp. 774 ◽  
Author(s):  
Tsu-Yang Wu ◽  
Jerry Lin ◽  
Yuyu Zhang ◽  
Chun-Hao Chen

Privacy-preserving data mining (PPDM) has attracted growing interest in recent years because it hides confidential information while still allowing useful knowledge to be discovered. Data sanitization is a common way to perturb a database so that sensitive or confidential information is hidden. PPDM is not a trivial task and can be considered a non-deterministic polynomial-time (NP)-hard problem. Many algorithms have been studied to derive optimal solutions using an evolutionary process, although most are based on straightforward or single-objective methods for discovering the candidate transactions/items for sanitization. In this paper, we present a multi-objective algorithm using a grid-based method (called GMPSO) to find optimal solutions as candidates for sanitization. The designed GMPSO uses two strategies for updating gbest and pbest during the evolutionary process. Moreover, the pre-large concept is adapted to speed up the evolutionary process by reducing the multiple database scans otherwise needed in each iteration. GMPSO derives multiple Pareto solutions based on Pareto dominance, rather than the single solution produced by single-objective algorithms. In addition, the side effects of the sanitization process can be significantly reduced. Experiments show that the designed GMPSO produces fewer side effects than the previous single-objective algorithm and the NSGA-II-based approach, and that the pre-large concept also reduces the computational cost compared to the NSGA-II-based algorithm.
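The Pareto dominance test mentioned in this abstract can be illustrated with a minimal sketch. It is not the GMPSO implementation: the grid structure, the gbest/pbest updates, and the pre-large concept are not shown, and the objective tuples (counts of side effects to be minimized) and the names `dominates` and `pareto_front` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumption: each candidate sanitization is scored by a tuple of
# side-effect counts, all of which should be minimized.
Objectives = Tuple[int, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical scores for four candidate sanitizations.
candidates = [(0, 5, 2), (1, 3, 2), (2, 3, 3), (0, 6, 1)]
front = pareto_front(candidates)
print(front)  # [(0, 5, 2), (1, 3, 2), (0, 6, 1)]
```

Only `(2, 3, 3)` is dropped, because `(1, 3, 2)` is at least as good in every objective and better in two. The remaining three trade one side effect against another, which is why a multi-objective method returns a set of solutions instead of a single one.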

2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Chun-Wei Lin ◽  
Tzung-Pei Hong ◽  
Hung-Chuan Hsu

Data mining is traditionally adopted to retrieve and analyze knowledge from large amounts of data. Private or confidential data may be sanitized or suppressed before it is shared or published. Privacy-preserving data mining (PPDM) has thus become an important issue in recent years. The most common approach to PPDM is to sanitize the database to hide the sensitive information. In this paper, a novel hiding-missing-artificial utility (HMAU) algorithm is proposed to hide sensitive itemsets through transaction deletion. The transaction with the maximal ratio of sensitive to nonsensitive itemsets is selected to be deleted entirely. Three side effects (hiding failures, missing itemsets, and artificial itemsets) are considered to evaluate whether transactions need to be deleted to hide the sensitive itemsets. Three weights are also assigned to indicate the importance of the three factors, and can be set according to user requirements. Experiments are then conducted to show the performance of the proposed algorithm in terms of execution time, number of deleted transactions, and number of side effects.
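The three side effects named in this abstract can be made concrete with a small sketch. It only evaluates the effect of a given set of deletions; it is not the HMAU selection procedure. The toy database, the minimum support ratio of 0.35, the weights, and all function names are hypothetical, and frequent itemsets are enumerated by brute force, which is only practical for tiny examples.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set

Itemset = FrozenSet[str]


def frequent_itemsets(db: List[Set[str]], min_ratio: float) -> Set[Itemset]:
    """Brute-force enumeration of itemsets with support ratio >= min_ratio."""
    if not db:
        return set()
    items = sorted(set().union(*db))
    found: Set[Itemset] = set()
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = sum(1 for t in db if set(combo) <= t)
            if count / len(db) >= min_ratio:
                found.add(frozenset(combo))
    return found


def delete(db: List[Set[str]], indices: Iterable[int]) -> List[Set[str]]:
    """Return a copy of db without the transactions at the given positions."""
    drop = set(indices)
    return [t for i, t in enumerate(db) if i not in drop]


def side_effects(before: Set[Itemset], after: Set[Itemset], sensitive: Set[Itemset]):
    hiding_failures = sensitive & after      # sensitive itemsets still frequent
    missing = (before - sensitive) - after   # nonsensitive itemsets lost
    artificial = after - before              # itemsets that became frequent
    return hiding_failures, missing, artificial


def weighted_cost(effects, weights=(0.5, 0.25, 0.25)) -> float:
    """Hypothetical weighted sum of the three side-effect counts."""
    return sum(w * len(e) for w, e in zip(weights, effects))


# Toy database (hypothetical); {a, b} is the sensitive itemset.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}, {"b", "c"}, {"d"}]
sensitive = {frozenset("ab")}
MIN_RATIO = 0.35

before = frequent_itemsets(db, MIN_RATIO)
one = side_effects(before, frequent_itemsets(delete(db, [1]), MIN_RATIO), sensitive)
two = side_effects(before, frequent_itemsets(delete(db, [0, 1]), MIN_RATIO), sensitive)
print(weighted_cost(one), weighted_cost(two))  # 0.75 0.25
```

Deleting only transaction 1 leaves `{a, b}` frequent (a hiding failure) and makes `{b, c}` newly frequent, because the database shrinks while the support ratio stays fixed. Deleting transactions 0 and 1 hides `{a, b}` at the price of losing the nonsensitive itemset `{a}`.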


2013 ◽  
Vol 321-324 ◽  
pp. 2570-2573
Author(s):  
Hui Wang

Data mining aims to discover knowledge that is unknown and hidden in huge databases and that helps people understand the data and make better decisions. Some of the knowledge discovered through data mining is considered sensitive, and the holder of the database will not share it because it might cause serious privacy or security problems. Privacy-preserving data mining aims to hide sensitive knowledge and is becoming increasingly important and attractive. Association rules are one of the most important classes of knowledge to be mined, and so is the hiding of sensitive association rules. The side effects of the existing data mining technology are investigated, and representative strategies for association rule hiding are discussed.


2014 ◽  
Vol 10 (1) ◽  
pp. 55-76 ◽  
Author(s):  
Mohammad Reza Keyvanpour ◽  
Somayyeh Seifi Moradi

In this study, a new model is provided for customized privacy in privacy-preserving data mining, in which the data owners define different privacy levels for different features. Additionally, to improve perturbation methods, a method combining singular value decomposition (SVD) and feature selection is defined so as to benefit from the advantages of both domains. Also, to assess the amount of distortion created by the proposed perturbation method, new distortion criteria are defined in which the distortion created in the process of feature selection is considered based on the privacy value of each feature. Tests and analysis of the results show that the method based on this model, compared to previous approaches, improves privacy, the accuracy of mining results, and the efficiency of privacy-preserving data mining systems.

