Short Text Classification Algorithm Based on Semi-Supervised Learning and SVM

Abstract

English

Short text is a popular text form that is widely used in real-time network news, short commentary, micro-blogs, and many other fields. With the growth of applications such as QQ, mobile phone text messaging, and movie websites, the volume of data keeps increasing. Most of this data is useless to us, while some of it is significant. It is therefore necessary to extract the useful short texts from big data. However, short text classification faces many problems, such as few features and irregularity. To address these problems, the short text set should first be preprocessed, and the significant features then selected. This paper uses a semi-supervised learning method together with an SVM classifier to improve on traditional methods; the approach can classify a large number of short texts in order to mine useful messages from them. The experimental results reported in this paper also show a good improvement.

Table of Contents

Abstract
 1. Introduction
 2. Related Research Statuses
 3. Short Text Classifications
  3.1. Pretreatment of Short Text
  3.2. Feature Expression
  3.3. Feature Selecting Methods
  3.4. Feature Weight Calculation
  3.5. Support Vector Machines (SVM)
  3.6. Semi-Supervised Learning
  3.7. Improved Semi-Supervised Learning Algorithm
 4. Experiment Result and Effect Analysis
  4.1. Experimental Data
  4.2. Evaluating Indicator
  4.3. Experimental Results Contrast
 5. Conclusion
 References

Author Information

  • Chunyong Yin Jiangsu Key Laboratory of Meteorological Observation and Information Processing, School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Jun Xiang Jiangsu Key Laboratory of Meteorological Observation and Information Processing, School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Hui Zhang Jiangsu Key Laboratory of Meteorological Observation and Information Processing, School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Zhichao Yin Nanjing No.1 Middle School, Nanjing, Jiangsu, Postal code 210001, China
  • Jin Wang Jiangsu Key Laboratory of Meteorological Observation and Information Processing, School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science & Technology, Nanjing 210044, China

