Source Information
Science and Engineering Research Support Society (보안공학연구지원센터) (IJUNESST)
International Journal of u- and e- Service, Science and Technology
Vol.9 No.8
2016.08
pp.227-236
Citations: 0 (data provided by Naver Academic)
Abstract
English
This paper analyzes the shortcomings of the traditional hidden Markov model (HMM) crawler, improves both the clustering strategy for web pages and the algorithm that judges how relevant pages and hyperlinks are to the topic, and proposes an AHMM (Adaptive Hidden Markov Model) modeling method. The experimental results show that the improved AHMM is much more efficient than the traditional HMM.
Table of Contents
Abstract
1. Introduction
2. The Shortcomings of Traditional HMM Crawler
3. The Overall Framework of HMM Crawler
4. Training Set Page Clustering Strategy of AHMM Crawler
5. The Selection Method of AHMM Crawler for the Page to be Collected
5.1. Judgment on the Correlation Degree Between the Page and the Topic
5.2. Judgment on the Correlation Degree Between URLs and the Topic
6. HMM Modeling in AHMM Crawlers
7. Implementation and Experimental Analysis
7.1. The HMM Training
7.2. The HMM Path Prediction
7.3. Experimental Analysis
8. Conclusion
References