Source Information
Abstract (English)
With the explosive growth of web information, acquiring target information quickly, precisely, and effectively from such a large volume of network information is constrained by many factors. In response, this paper analyzes the key technical component of network information search engines, web crawler technology, and proposes a network information target search model based on a web vertical crawler system, with a discussion of how the corresponding search strategy is implemented. First, it builds the structure of the web vertical crawler system and analyzes its different functional modules. Next, it discusses several crucial problems, including the options for deleting duplicated URLs, the strategy for choosing a duplicated-URL deletion method, and the control model for the search error probability estimate, so as to acquire the network information closest to the target information. Finally, it verifies and discusses the operability and effectiveness of the model and its implementation strategy through a case study.
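To make the duplicated-URL deletion step concrete, the sketch below shows one common way a crawler frontier can skip URLs it has already enqueued. The URL normalization rules, the MD5 digests, and the `URLFrontier` class are illustrative assumptions for this sketch only; the paper's own deletion options and selection strategy are developed in Sections 3.1 and 3.2.

```python
# Minimal sketch of duplicate-URL deletion in a crawler frontier (assumed design,
# not the paper's method): normalize each URL, hash it, and skip repeats.
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Reduce trivially different URLs (case, fragments) to one canonical form."""
    scheme, netloc, path, query, _fragment = urlsplit(url.strip())
    return urlunsplit((scheme.lower(), netloc.lower(), path or "/", query, ""))


class URLFrontier:
    """Queue of URLs to crawl, with hash-based duplicate deletion."""

    def __init__(self):
        self._seen = set()    # digests of URLs already enqueued
        self._queue = []

    def push(self, url: str) -> bool:
        digest = hashlib.md5(normalize(url).encode()).hexdigest()
        if digest in self._seen:   # duplicated URL: delete (skip) it
            return False
        self._seen.add(digest)
        self._queue.append(url)
        return True

    def pop(self):
        return self._queue.pop(0) if self._queue else None


if __name__ == "__main__":
    frontier = URLFrontier()
    for u in ["http://Example.com/a", "http://example.com/a#frag", "http://example.com/b"]:
        print(u, "->", "queued" if frontier.push(u) else "duplicate")
```

An in-memory set is the simplest option; at web scale a crawler would typically trade it for a disk-backed index or a Bloom filter, which is where an error probability estimate of the kind the paper controls in Section 4 becomes relevant.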
Table of Contents
1. Introduction
2. Build the Structure of the Web Vertical Crawler System
3. The Options and Strategies to Delete the Duplicated URL in the Web Vertical Crawler System
3.1. The Options to Delete the Duplicated URL
3.2. Implementation Strategy of Deleting the Duplicated URL
4. The Control Model of the Search Estimate Error Probability of Web Vertical Crawler
5. Case Validation and Analysis
6. Conclusion
Acknowledgment
References