A NOVEL PRIVACY PRESERVING DATA MINING ALGORITHM

Document Type: Original Article

Authors

1 (Prof.) Dean of the Faculty of Computers and Information, Cairo University.

2 (Ass. Prof.) College of Engineering and Technology, Arab Academy for Science, Technology and Maritime Transport (AAST).

3 (Ph.D.) Egyptian Armed Forces.

4 (Eng.) Egyptian Armed Forces.

Abstract

In recent years, the growing collection of personal data by institutions and merchants over the Internet has raised privacy concerns. As a result, there is increasing interest in building accurate data mining models over aggregate data while protecting privacy at the level of individual records. One approach to this problem is to randomize the values in individual records and disclose only the randomized values. This method preserves privacy while still giving access to the information implicit in the original attributes. The distribution of the original data set is important, and estimating it from the randomized values is one of the goals of data mining algorithms. In this paper, a novel privacy preserving data mining algorithm based on an Artificial Neural Network (ANN) is introduced. The ANN model is a single-layer neural network, the adaptive linear neuron (ADALINE), and it is used to reconstruct the original distribution. The paper also presents a comparative study with two of the most recent algorithms that address this problem. Our empirical results show that the new algorithm can reconstruct the original data distribution with a very high degree of precision.
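
The idea of randomizing values and then reconstructing the original distribution with an ADALINE can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes integer-valued records, additive uniform integer noise, and a single linear neuron without a bias term, trained with the Widrow-Hoff (LMS) rule so that its weights approximate the original histogram. The names randomize, noise_matrix and adaline_reconstruct, and all parameter values, are hypothetical.

```python
import numpy as np


def randomize(values, r, rng):
    """Disclose x + y, where y is uniform integer noise in [-r, r] (assumed noise model)."""
    noise = rng.integers(-r, r + 1, size=values.shape)
    return values + noise


def noise_matrix(k, r):
    """A[i, j] = P(disclosed value == i - r | original value == j) for originals in 0..k-1."""
    a = np.zeros((k + 2 * r, k))
    for j in range(k):
        # Original value j can be disclosed as j - r, ..., j + r, each equally likely.
        a[j:j + 2 * r + 1, j] = 1.0 / (2 * r + 1)
    return a


def adaline_reconstruct(a, target, lr=1.0, epochs=3000):
    """Train one linear neuron with the Widrow-Hoff (LMS) rule.

    Each row of `a` is an input pattern and the matching entry of `target`
    is its desired output. The learned weights estimate the original
    distribution, because the disclosed histogram is approximately `a @ p`.
    """
    w = np.full(a.shape[1], 1.0 / a.shape[1])  # start from a uniform guess
    for _ in range(epochs):
        for x, d in zip(a, target):
            w += lr * (d - x @ w) * x  # LMS update: move along the error gradient
    return w


# Illustrative data only: a made-up distribution over six integer values.
rng = np.random.default_rng(0)
k, r = 6, 1
p_true = np.array([0.05, 0.10, 0.30, 0.35, 0.15, 0.05])
original = rng.choice(k, size=50_000, p=p_true)

# Only the randomized values are disclosed to the miner.
disclosed = randomize(original, r, rng)
observed = np.bincount(disclosed + r, minlength=k + 2 * r) / disclosed.size

estimate = adaline_reconstruct(noise_matrix(k, r), observed)
print("original :", p_true)
print("estimate :", np.round(estimate, 3))
```

In this sketch the miner never sees the original records, only the disclosed values and the known noise model. The estimate is not constrained to be non-negative or to sum to one, so a practical version would likely add such a normalization step.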