A Method of Collecting Mongolian Web page Based on Hyperlink Correlation Degree

Zhiqiang Ma; Rui Yan; Zeguang Zhang; Shuangtao Yang

Security Engineering Research Support Center (보안공학연구지원센터, IJCA), International Journal of Control and Automation, Vol. 8, No. 11, November 2015, pp. 361-372. Indexed in SCOPUS.

Abstract

Because the encoding of Mongolian web pages is not unified and the number of such pages is small, a collection method that combines a linguistic model with hyperlink analysis is designed. First, the language of each web page is identified with an N-Gram language model, and the average distance obtained from language identification is used as one component of the hyperlink correlation degree. Second, the hyperlink correlation degree is calculated from the anchor text, hyperlink increasing, and hyperlink depth. Finally, the hyperlinks are sorted by hyperlink correlation degree and become the collecting seeds for the next web page. The experimental results show that collecting Mongolian web pages based on hyperlink correlation degree can effectively increase the amount of information collected, the collection speed, and the accuracy rate.
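
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract names the inputs (language-identification distance, anchor text, hyperlink increasing, hyperlink depth) but gives no formula, so everything concrete here is an assumption made for illustration: the `LinkFeatures` fields, their normalization to the range 0 to 1, the weighted-sum form, the weights, and the `1 / (1 + depth)` depth term. It is not the authors' actual computation.

```python
from dataclasses import dataclass


@dataclass
class LinkFeatures:
    """Hypothetical per-hyperlink features; all names and scales are assumptions."""
    url: str
    anchor_score: float   # assumed 0..1: how Mongolian-looking the anchor text is
    lang_distance: float  # assumed 0..1: average N-gram distance (lower = closer match)
    link_growth: float    # assumed 0..1: stand-in for the "hyperlink increasing" factor
    depth: int            # link depth from the starting seed (0 = seed page)


def correlation_degree(f, weights=(0.4, 0.3, 0.2, 0.1)):
    """Combine the features into one score (assumed weighted sum, assumed weights)."""
    w_anchor, w_lang, w_growth, w_depth = weights
    depth_score = 1.0 / (1 + f.depth)  # assumption: shallower links score higher
    return (w_anchor * f.anchor_score
            + w_lang * (1.0 - f.lang_distance)  # smaller distance -> larger score
            + w_growth * f.link_growth
            + w_depth * depth_score)


def order_seeds(links):
    """Return URLs sorted by descending score, i.e. the next collecting seeds."""
    ranked = sorted(links, key=correlation_degree, reverse=True)
    return [f.url for f in ranked]
```

A crawler built along these lines would call `order_seeds` on the hyperlinks extracted from each fetched page and visit the returned URLs in order, so that links judged more likely to lead to Mongolian pages are collected first.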

Author Information

Zhiqiang Ma, School of Information Engineering, Inner Mongolia University of Technology, Hohhot, China
Rui Yan, School of Information Engineering, Inner Mongolia University of Technology, Hohhot, China
Zeguang Zhang, School of Information Engineering, Inner Mongolia University of Technology, Hohhot, China
Shuangtao Yang, School of Information Engineering, Inner Mongolia University of Technology, Hohhot, China
