Article Information
Abstract
English
DNA microarray technology can measure the activity of tens of thousands of genes in cells and has been widely used in clinical diagnosis. However, microarray data are high-dimensional with few samples, and the many irrelevant and redundant genes degrade the performance of classification algorithms. Mutual information is an effective criterion that has been widely used in feature gene selection, but it cannot deal with continuous features directly. This paper therefore proposes a novel feature gene selection method to resolve this problem. First, the ReliefF algorithm eliminates a large number of irrelevant genes from the original data to obtain a candidate gene subset. Second, an algorithm based on neighborhood mutual information and a forward greedy search strategy, which handles continuous features directly, selects feature genes from this candidate subset. Because the neighborhood radius greatly affects reduction performance, a differential evolution algorithm is applied to optimize the radius before reduction. Simulation results on six benchmark microarray datasets show that the method obtains higher classification accuracy while using as few genes as possible and, in particular, that neighborhood mutual information can handle continuous features directly. The selected feature genes are important for understanding microarray data and for finding pathogenic genes of cancer. The proposed method is effective and efficient for feature gene selection.
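The second stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly cited definition of neighborhood mutual information, NMI(R; S) = -(1/n) Σ log(|δ_R(x_i)| · |δ_S(x_i)| / (n · |δ_{R∪S}(x_i)|)), a Chebyshev-distance neighborhood of radius `delta` on features scaled to [0, 1], and class labels as the decision attribute. The function names and the fixed gene budget `k` used as a stopping rule are hypothetical; the abstract does not state the stopping criterion, and the ReliefF pre-filter and the differential evolution search for `delta` are omitted.

```python
import numpy as np

def neighborhood_mask(X, delta):
    """n x n boolean matrix: entry (i, j) is True when sample j lies within
    Chebyshev distance delta of sample i in the feature subspace X."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    dist = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
    return dist <= delta

def neighborhood_mutual_information(X, y, delta):
    """NMI between continuous features X and discrete class labels y
    (assumed definition, see the lead-in)."""
    y = np.asarray(y)
    n = len(y)
    feat = neighborhood_mask(X, delta)      # neighbors in feature space
    same = y[:, None] == y[None, :]         # samples sharing a class label
    joint = feat & same                     # neighbors that also share the label
    ratio = feat.sum(axis=1) * same.sum(axis=1) / (n * joint.sum(axis=1))
    return float(-np.mean(np.log(ratio)))

def forward_greedy_select(X, y, delta, k):
    """Greedily add the gene that maximizes NMI of the selected subset with y.
    Stopping after k genes is an assumption made for illustration."""
    X = np.asarray(X, dtype=float)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < k:
        scores = [neighborhood_mutual_information(X[:, selected + [j]], y, delta)
                  for j in remaining]
        selected.append(remaining.pop(int(np.argmax(scores))))
    return selected
```

Because the neighborhood is computed from distances on the raw feature values, no discretization step is needed, which is the property the abstract emphasizes. The pairwise distance array grows as n × n × d, which is acceptable for the small sample counts typical of microarray data.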
Contents
1. Introduction
2. ReliefF Algorithm
3. Neighborhood Mutual Information
3.1 Mutual Information
3.2 Neighborhood Mutual Information
3.3 Feature Selection Based on Neighborhood Mutual Information and Forward Greedy Search Strategy
4. Differential Evolution Algorithm
5. Our Proposed Method
6. Experimental Results and Analysis
6.1 Experimental Datasets and Methods
6.2 Experimental Results and Analysis
7. Conclusion
Acknowledgements
References
