Source Information
Abstract
English
A document classifier is an essential tool for sorting the many types of documents generated in the Big Data era. In recent years, the wide variety of information services available on smartphones and portable mobile devices such as tablets has increased the need for techniques that classify data efficiently and accurately. A common document classification scheme is the naïve Bayes classifier. The classification performance of the naïve Bayes scheme varies widely depending on the feature extraction method applied to the documents. In this paper, we propose a system model whose feature extraction method combines term frequency with associated words. The model is then applied to the naïve Bayes classifier to classify documents more precisely. We propose this method as an alternative to traditional classification techniques. In addition, we evaluate the proposed technique experimentally against existing document classification techniques.
Table of Contents
1. Introduction
2. Related Works
2.1. Term Frequency-Inverse Document Frequency (TF-IDF) [5-7]
2.2. Naïve Bayes Classifier [8-9]
2.3. Apriori [10-12]
3. Improving Feature Extraction [13]
4. System Model [13]
4.1. Morphological Analysis and Feature Extraction
4.2. Document Classification
5. Experiments and Considerations
6. Conclusion
Acknowledgments
References