Article Information
Abstract
English
In the computer-aided diagnosis (CAD) process, one of the most challenging problems is data sparsity, which makes diagnosis results unreliable. This paper proposes an algorithm based on clustering and collaborative filtering to address data sparsity. We use the k-means clustering algorithm to group patients of the same type, and then apply collaborative filtering within each cluster to fill in the missing data values. Restricting the computation to a single cluster reduces the complexity of the similarity calculation in collaborative filtering. The proposed method makes full use of the information-sharing mechanism of a "similar patient population" to predict and fill the missing values. A hepatitis dataset is used to evaluate the performance of the algorithm. The results indicate that the proposed algorithm achieves better performance on the medical record data sparsity problem.
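The two-stage idea described in the abstract (cluster similar patients, then fill gaps from neighbors inside the same cluster) can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the similarity measure, the centroid initialization, or how missing values are handled during clustering, so the choices below (temporary mean imputation before k-means, the first k records as seeds, and a similarity of 1 / (1 + Euclidean distance) over co-observed attributes) are assumptions, as are the function names and the toy records.

```python
import math

def column_means(rows):
    """Mean of each column over observed (non-None) entries."""
    means = []
    for j in range(len(rows[0])):
        seen = [r[j] for r in rows if r[j] is not None]
        means.append(sum(seen) / len(seen) if seen else 0.0)
    return means

def kmeans(points, k, iters=20):
    """Plain k-means; seeding with the first k points is an assumption."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def similarity(a, b):
    """Assumed similarity: 1 / (1 + distance) over co-observed attributes."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in shared)))

def fill_missing(rows, k):
    """Cluster the records, then fill each gap from same-cluster neighbors."""
    means = column_means(rows)
    # Temporary mean imputation so k-means can compute distances (assumption).
    dense = [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]
    labels = kmeans(dense, k)
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        # Only patients in the same cluster are compared, which is what
        # keeps the similarity computation small.
        peers = [rows[n] for n in range(len(rows))
                 if n != i and labels[n] == labels[i]]
        for j, value in enumerate(row):
            if value is not None:
                continue
            pairs = [(similarity(row, p), p[j]) for p in peers if p[j] is not None]
            total = sum(w for w, _ in pairs)
            # Similarity-weighted average; fall back to the column mean.
            filled[i][j] = (sum(w * x for w, x in pairs) / total
                            if total > 0 else means[j])
    return filled, labels

# Hypothetical toy records: two patient groups, with None marking missing values.
records = [
    [1.0, 1.0, 1.0],
    [9.0, 9.0, 9.0],
    [1.0, None, 1.0],
    [9.0, 9.0, None],
    [1.0, 1.0, 1.0],
    [9.0, 9.0, 9.0],
]
filled, labels = fill_missing(records, k=2)
```

In this toy case each missing value is predicted only from records that fall in the same cluster, so the gap in a low-valued record is filled from low-valued neighbors rather than from the global column mean.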
Table of Contents
1. Introduction
2. CAD Process based on Data Mining
2.1. Process Description
2.2. Processing of Missing Medical Record Data Values
3. Algorithm Description
3.1. K-means Clustering Algorithm
3.2. Collaborative Filtering Algorithm
3.3. K-means CF Algorithm
4. The Results and Analysis of Experiment
4.1. Dataset
4.2. Evaluation Index of Experiment
4.3. Experimental Results and Analysis
5. Conclusion
Acknowledgement
References