Article Information
Abstract
English
With the rapid development of computer science and technology, data analysis has become one of the most active research areas in the pattern recognition community. Cluster analysis is an important step in data mining. Various multi-objective clustering techniques have been developed that can partition data automatically. In this paper, we propose a novel multilayer data clustering framework based on feature selection and a modified K-Means algorithm. To facilitate clustering, the proposed algorithm selects a representative feature subset to reduce the dimension of the raw data set. In addition, the selected feature subset has fewer missing values than the raw data set, which may improve clustering accuracy. Another distinctive property of the proposed algorithm is its use of a partial distance strategy. Experimental analysis and simulation indicate the feasibility and robustness of our method. In the future, we plan to conduct further mathematical analysis and modify our algorithm to achieve better results.
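The abstract names a partial distance strategy but does not define it. As a rough illustration only, the sketch below uses one common formulation from the missing-data clustering literature: the squared distance is summed over the features observed in both vectors and then rescaled by the ratio of total to observed features. The function name partial_distance and the use of None to mark a missing value are assumptions made for this example, and the paper's own definition may differ.

```python
def partial_distance(x, y):
    """Squared partial distance between two equal-length vectors.

    None marks a missing value (an assumption of this sketch).
    Only features observed in both vectors contribute; the sum is
    rescaled so that vectors with different numbers of missing
    values remain comparable.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    observed = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not observed:
        raise ValueError("no feature is observed in both vectors")
    scale = len(x) / len(observed)
    return scale * sum((a - b) ** 2 for a, b in observed)


# Complete vectors: the ordinary squared Euclidean distance.
d_full = partial_distance([1.0, 2.0, 3.0], [1.0, 0.0, 0.0])

# One missing feature: two of three features are used, rescaled by 3/2.
d_part = partial_distance([1.0, None, 3.0], [1.0, 0.0, 0.0])
```

In a K-Means setting, a distance of this kind could stand in for the Euclidean distance in the assignment step, so that records with missing values are still assigned to a cluster without imputation.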
Table of Contents
1. Introduction
2. Overview of Clustering Algorithms
2.1. Fuzzy C-Means Algorithm
2.2. The DENCLUE Algorithm
2.3. The Expectation-Maximization (EM) Algorithm
3. Our Proposed Framework
3.1. Feature Selection Through Hierarchical Clustering
3.2. Feature Selection Through Hierarchical Clustering
4. Experimental Analysis and Simulation
4.1. Set-up of the Experiment
4.2. Accuracy Experiment
4.3. Experimental Analysis on Execution Time
5. Conclusion and Summary
Acknowledgements
References
