Source Information
Security Engineering Research Support Center (IJGDC)
International Journal of Grid and Distributed Computing
Vol. 7, No. 4
August 2014
pp. 149-156
Citations: 0 (source: Naver Academic Information)
Abstract
(English)
Adopting a focused crawler to search web sites is the trend for next-generation search engines. The design and implementation of a focused crawler, TargetCrawler, is introduced in detail, including its overall architecture, main modules, working process, and two key algorithms: a Bloom filter-based duplicate removal algorithm and a priority-based ranking algorithm, which are designed to ensure the accuracy and efficiency of web search. Experimental results show the effectiveness of the scheme.
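The abstract does not give implementation details, so the following Python sketch only illustrates the two techniques it names: a Bloom filter that discards already-seen URLs, and a priority queue that orders the crawl frontier by relevance. The bit-array size, the number of salted MD5 hashes, the PriorityFrontier class, and the example relevance scores are assumptions for illustration, not the paper's actual parameters.

```python
import hashlib
import heapq


class BloomFilter:
    """Minimal Bloom filter for URL de-duplication (bit array + k salted hashes)."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, url):
        # Derive k bit positions from salted MD5 digests of the URL.
        for salt in range(self.num_hashes):
            digest = hashlib.md5(f"{salt}:{url}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, url):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))


class PriorityFrontier:
    """URL frontier ordered by relevance score: higher-priority links are crawled first."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker keeps insertion order for equal scores

    def push(self, url, priority):
        # heapq is a min-heap, so negate the priority to pop the best link first.
        heapq.heappush(self._heap, (-priority, self._counter, url))
        self._counter += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


# Sketch of the enqueue step: skip URLs the Bloom filter has already seen,
# otherwise rank them by a (hypothetical) topic-relevance score.
seen = BloomFilter()
frontier = PriorityFrontier()
candidates = [("http://example.com/grid", 0.9),
              ("http://example.com/misc", 0.2),
              ("http://example.com/grid", 0.9)]  # duplicate URL
for url, score in candidates:
    if url not in seen:
        seen.add(url)
        frontier.push(url, score)

while frontier:
    print(frontier.pop())  # most relevant unseen URL first
```

Under this sketch, a duplicate URL is dropped at enqueue time (barring the Bloom filter's small false-positive rate), and the crawler always dequeues the highest-scoring unseen link next.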
Table of Contents
Abstract
1. Introduction
2. System Frameworks
2.1. The Overall Design
2.2. Crawling Process
3. Key Algorithms
3.1. Bloom Filter-based Duplicate Removing Algorithm
3.2. Priority-based Sorting Algorithm
4. Experiments and Analysis
4.1. Experimental Environment
4.2. Experimental Results and Analysis
5. Conclusions
Acknowledgements
References
Author Information
References