Source Information
Security Engineering Research Support Center (IJGDC)
International Journal of Grid and Distributed Computing
Vol.9 No.10
2016.10
pp.311-320
Citations: 0 (data provided by Naver Academic)
Abstract
English
Eliminating noisy information and extracting the main content from web pages is becoming an important research issue in the field of information retrieval. In this paper, we present an information extraction approach based on the DOM tree and weight value calculation. It consists of the following steps: parse the web page to construct the DOM tree, extract the title and keywords, calculate the weight values, and obtain the content. The experimental results show that the method achieves a high accuracy rate when extracting content on various themes.
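The four steps named in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the method from the paper: the abstract does not give the weight formula, so the one used here (keyword hits scaled by the amount of non-link text) is an assumption, as are all names such as `DomBuilder`, `weight`, `extract_content`, and `SAMPLE_HTML`.

```python
from html.parser import HTMLParser
import re

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
BLOCKS = {"body", "div", "td", "article", "section"}  # candidate content blocks (assumed)
IGNORED = {"script", "style", "head", "title"}        # never counted as body text


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag, self.attrs, self.parent = tag, dict(attrs or []), parent
        self.children = []  # child Node objects and str text chunks, in order


class DomBuilder(HTMLParser):
    """Step 1: parse the page into a simple DOM tree."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID:  # void elements never contain children
            self.current = node

    def handle_endtag(self, tag):
        node = self.current  # climb to the nearest open element with this tag
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(data)


def walk(node):
    yield node
    for child in node.children:
        if isinstance(child, Node):
            yield from walk(child)


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def extract_terms(root):
    """Step 2: collect words from <title> and <meta name="keywords">."""
    terms = set()
    for node in walk(root):
        if node.tag == "title":
            terms.update(tokenize(" ".join(c for c in node.children if isinstance(c, str))))
        elif node.tag == "meta" and (node.attrs.get("name") or "").lower() == "keywords":
            terms.update(tokenize(node.attrs.get("content") or ""))
    return terms


def own_text(node, in_link=False):
    """Return (all_text, link_text) of a block, without entering nested blocks."""
    all_parts, link_parts = [], []
    for child in node.children:
        if isinstance(child, str):
            all_parts.append(child)
            if in_link:
                link_parts.append(child)
        elif child.tag not in BLOCKS and child.tag not in IGNORED:
            text, link = own_text(child, in_link or child.tag == "a")
            all_parts.append(text)
            link_parts.append(link)
    return " ".join(all_parts), " ".join(link_parts)


def weight(node, terms):
    """Step 3 (assumed formula): keyword hits times the amount of non-link text."""
    text, link_text = own_text(node)
    words = tokenize(text)
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in terms)
    link_ratio = len(tokenize(link_text)) / len(words)
    return (1 + hits) * len(words) * (1 - link_ratio)


def extract_content(html):
    """Step 4: return the text of the highest-weighted block."""
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    terms = extract_terms(builder.root)
    blocks = [n for n in walk(builder.root) if n.tag in BLOCKS]
    best = max(blocks, key=lambda n: weight(n, terms), default=None)
    return "" if best is None else " ".join(own_text(best)[0].split())


SAMPLE_HTML = """<html><head><title>Grid Computing News</title>
<meta name="keywords" content="grid, computing"></head><body>
<div id="nav"><a href="/">Home</a> <a href="/about">About grid</a></div>
<div id="main"><h1>Grid computing</h1><p>Grid computing links many machines
into one pool.</p><p>Schedulers place jobs on idle nodes.</p></div>
<div id="footer">Copyright notice</div></body></html>"""

if __name__ == "__main__":
    print(extract_content(SAMPLE_HTML))
```

In this sketch, navigation blocks score zero because all of their text sits inside links, while the block that repeats the title and keyword terms scores highest. A real page would likely need a richer weight function and handling for malformed markup.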
Table of Contents
Abstract
1. Introduction
2. Related Work
3. Information Extraction Course
3.1. Rule Induction with ML
3.2. Determining Body Information Block
3.3. Information Extraction Steps
4. Experiment and Analysis
5. Conclusions
References