

CHI Statistical Text Feature Selection Method Based on Information Entropy Optimization



A CHI statistical text feature selection method based on information entropy optimization is presented in this paper. For feature selection in text categorization, we consider how a feature is distributed both within and among categories, and introduce three factors to optimize the CHI statistical method: the information entropy of feature frequency among categories, the information entropy within a category, and the feature frequency information within a category. The experimental results show that the classification accuracy of the optimized CHI method is significantly higher than that of the traditional CHI statistical method.
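As a rough illustration of the two ingredients named in the abstract, the sketch below computes the standard CHI (chi-square) statistic for one feature and one category from a 2x2 document-count table, and the Shannon entropy of a feature's frequency distribution across categories. This page does not give the paper's actual formula for combining them, so the two quantities are shown separately; the function names `chi_square` and `entropy` and the example counts are assumptions made for illustration only.

```python
import math


def chi_square(a, b, c, d):
    """Standard chi-square statistic for a feature t and a category C.

    a: documents in C that contain t
    b: documents outside C that contain t
    c: documents in C that do not contain t
    d: documents outside C that do not contain t
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A row or column of the 2x2 table is empty; treat as no association.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def entropy(counts):
    """Shannon entropy (base 2) of a frequency distribution.

    counts: non-negative frequencies of one feature, e.g. one per category.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for x in counts:
        if x > 0:
            p = x / total
            h -= p * math.log2(p)
    return h


if __name__ == "__main__":
    # Hypothetical counts: the feature appears only in the target category.
    print(chi_square(10, 0, 0, 10))
    # Hypothetical frequencies of one feature across four categories.
    print(entropy([4, 0, 0, 0]), entropy([1, 1, 1, 1]))
```

A feature concentrated in a single category has low entropy among categories, while one spread evenly over all categories has high entropy. This is the kind of distributional signal that the plain CHI statistic does not capture on its own.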


 1. Introduction
 2. CHI Statistics Algorithm Based on Information Entropy Optimization
  2.1. Thought of Information Entropy among Categories
  2.2. Thought of Information Entropy within a Category
  2.3. Feature Frequency Information within a Category
  2.4. CHI Statistical Algorithm Based on Information Entropy
 3. Experiments and Analysis
  3.1. Dataset
  3.2. Evaluation Criteria
  3.3. Experiment Result
 4. Conclusion


  • Guohua Wu, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
  • Sen Li, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
  • Lin Han, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
  • Mengmeng Zhao, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China


