SSiCP : a new SVM based Recursive Feature Elimination Algorithm for Multiclass Cancer Classification

Xiaobo Li; Xue Gong; Xiaoning Peng; Sihua Peng

Science and Engineering Research Support Society (SERSC), International Journal of Multimedia and Ubiquitous Engineering (IJMUE), Vol. 9, No. 6, June 2014, pp. 347-360 (SCOPUS-indexed)

Abstract

Selecting a small number of informative genes for accurate classification is a crucial step in cancer diagnosis, and it has become a major focus in the data mining of gene expression profiles. Many conventional classification methods perform very poorly, especially on data covering a large number of cancer types. Here, we propose a new approach to gene selection and multi-cancer classification based on step-by-step improvement of classification performance (SSiCP). The SSiCP gene selection algorithm was evaluated on the NCI60 and GCM benchmark data sets, achieving accuracies of 96.6% and 95.5%, respectively, in 10-fold cross-validation. Furthermore, SSiCP outperformed recently published algorithms when applied to two other multi-cancer data sets. Computational evidence indicated that SSiCP can effectively avoid overfitting. Compared with various other gene selection algorithms, SSiCP is simple to implement, and many of the genes it selects are shown to be closely related to cancer.

Table of Contents

Abstract
1. Introduction
2. Materials and Methods
  2.1. Data Sets
  2.2. Gene Pre-selection
  2.3. RFE: Recursive Feature Elimination
  2.4. Feature Selection Methodology
  2.5. Over-fitting Evaluation of SSiCP Algorithm
  2.6. Confirmation of Classification Algorithm in the Second Step of Feature Selection
  2.7. Parameter Selection on Weka
3. Results
  3.1. Initial Noise Removal and Comparison of Classification Algorithms
  3.2. Gene Selection based on Step-by-step Improvement of Classification Performance
  3.3. Comparison of Computational Results using Four Data Sets
  3.4. Overfitting Evaluation
4. Discussion
5. Conclusion
Acknowledgements
Disclosure
References

Author Information

Xiaobo Li, Department of Computer Science and Technology, College of Engineering, Lishui University, Lishui 323000, China
Xue Gong, Department of Microbiology and Immunology, School of Medicine, Stanford University, Stanford, CA 94305-5101, USA
Xiaoning Peng, Department of Internal Medicine, School of Medicine, Hunan Normal University, Changsha 410006, China
Sihua Peng, Department of Biological Technology, School of Fisheries and Life Science, Shanghai Ocean University, Shanghai 201306, China