Article Information
Abstract
English
The Internet is widely used as a medium for exchanging information and opinions, so public opinion analysis is essential for understanding and guiding those opinions. This paper proposes a hybrid public opinion analysis method based on improved clustering and mutual information. During feature extraction, word weights are adjusted according to part-of-speech tagging to reduce the dimensionality of the original texts. For clustering, a novel density peak algorithm is improved and combined with binary search to determine the number of clusters K and the initial centers for KMeans. Hot word extraction, sentiment analysis, and trend analysis are then performed for each cluster using mutual information, mining knowledge that supports decision-making. Extensive experiments on Hadoop show that the proposed hybrid public opinion analysis method is effective and of some practical significance.
Table of Contents
1. Introduction
2. Text Clustering of News
2.1. Feature Extraction of News
2.2. News Clustering
2.3. Hot Words Extraction
3. Sentiment Analysis
3.1. Sentiment Score
3.2. Trend Analysis
4. Experiments and Evaluation
4.1. Datasets
4.2. Environment
4.3. The Results of Experiments
5. Conclusion and Future Work
Acknowledgement
References