Web Spam Detection Based On Link Diversity and Content Features

Xu Gongwen; Li Xiaomei; Zhang Zhijun; Xu Li’Na

Security Engineering Research Support Center (IJSIA), International Journal of Security and Its Applications, Vol. 10, No. 7, July 2016, pp. 363-372 (SCOPUS)

Abstract

To obtain a higher ranking, spam pages deceive search engines with cheating techniques, which makes it harder for users to find useful information. Web spam is designed for search engines rather than for users, so it is important to distinguish normal web pages from spam pages. The links of normal web pages come from a wide variety of sources and their content features are distributed regularly, whereas the links of spam pages come from a single source and their content features are distributed irregularly. Based on an analysis of the link diversity and the content feature distribution of web pages, this paper proposes a new web page ranking algorithm. In this method, the ranking score of a web page is calculated by the TrustRank method combined with the page's link diversity and content features. The experimental results show that the method effectively reduces the ranking scores of spam pages.
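
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical diversity measure (distinct source hosts divided by total in-links), a hypothetical content score in [0, 1], and a hypothetical combination rule in which the trust a page receives during TrustRank propagation is scaled by the product of the two. All names, the toy graph, and the scores are illustrative assumptions.

```python
def link_diversity(source_hosts):
    """Hypothetical diversity measure: distinct source hosts / total in-links."""
    if not source_hosts:
        return 0.0
    return len(set(source_hosts)) / len(source_hosts)


def trustrank(out_links, seeds, page_weight, alpha=0.85, iterations=50):
    """Seed-biased PageRank (TrustRank). Scaling the trust a page receives
    by page_weight[page] is an assumption made for this sketch."""
    pages = list(out_links)
    # Trust is injected only at the hand-picked trusted seed pages.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)
    for _ in range(iterations):
        incoming = {p: 0.0 for p in pages}
        for p in pages:
            targets = out_links[p]
            if not targets:
                continue
            # Each page splits its current trust evenly among its out-links.
            share = trust[p] / len(targets)
            for t in targets:
                incoming[t] += share
        trust = {
            p: alpha * page_weight[p] * incoming[p] + (1 - alpha) * seed_share[p]
            for p in pages
        }
    return trust


# Toy data (entirely made up): one trusted seed links to a normal page and a spam page.
out_links = {"seed": ["good", "spam"], "good": ["seed"], "spam": ["seed"]}
in_hosts = {
    "seed": ["a.org", "b.org"],
    "good": ["a.org", "b.org", "c.org"],        # in-links from varied hosts
    "spam": ["x.biz", "x.biz", "x.biz", "x.biz"],  # in-links from a single host
}
content_score = {"seed": 1.0, "good": 0.9, "spam": 0.3}  # hypothetical scores

weight = {p: link_diversity(in_hosts[p]) * content_score[p] for p in out_links}
trust = trustrank(out_links, seeds={"seed"}, page_weight=weight)
print(sorted(trust, key=trust.get, reverse=True))
```

In this toy graph the seed passes the same amount of trust to both pages, so the spam page ends up with a lower score purely because of its low diversity and content weight. How the paper actually defines and combines these quantities is given in its Section 2, which this page does not reproduce.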

Table of Contents

Abstract
1. Introduction
2. Web Link and Content Features
  2.1. Link Diversity
  2.2. Content Features of Web Pages
  2.3. Ranking combining link diversity and content features
3. Experiment and Results
  3.1. Dataset
  3.2. Measurement Standard
  3.3. Results
4. Conclusion
Acknowledgements
References

Author Information

Xu Gongwen, School of Computer Science and Technology, Shandong Jianzhu University
Li Xiaomei, Cancer Center of the Second Hospital, Shandong University
Zhang Zhijun, School of Computer Science and Technology, Shandong Jianzhu University
Xu Li’Na, School of Computer Science and Technology, Shandong Jianzhu University
