Source Information
Abstract
English
The goal of this article is two-fold: to provide a big data analysis of 330 reviews of the movie Noryang, and to evaluate four machine learning and deep learning models (the Naive Bayes model, the Random Forests model, the DNN model, and the LSTM model). The name Yi, Sun-shin was the term viewers used most frequently, followed by the word movie and then the word general. A major finding is that the name Yi, Sun-shin and the word movie each appeared twice as the top-ranked keyword, which suggests that these are the most noteworthy keywords. The sentiment analysis indicates that about 75% of viewers regarded the film as well-made and were highly satisfied with it. We then trained the Naive Bayes model, the Random Forests model, the DNN model, and the LSTM model to predict whether each review is positive or negative. The Random Forests model works well for our data, whereas the Naive Bayes model does not. When trained for 25 epochs, the DNN model worked well for our data, reaching an accuracy of 82.76%. The accuracy of the LSTM model did not improve over 9 epochs of training. Even so, the LSTM model is slightly better than the DNN model in terms of accuracy on the test data.
Table of Contents
II. Methods
III. A Big Data Analysis of 330 Reviews
1. The Total Term Frequency
2. The Word Cloud Representing 330 Reviews
3. A Network Analysis
4. Clusters
5. Topics
6. A Sentiment Analysis
IV. The Predictive Power of Four Models in Machine Learning and Deep Learning
1. Data
2. Naive Bayes Model and Random Forests Model
3. DNN Model and LSTM Model
V. Conclusion
Works Cited
Abstract