
Study on the Distributed Crawling for Processing Massive Data in the Distributed Network Environment

Abstract

English

Owing to advances in IT, the spread of smartphones, and the growing use of SNS, a wide variety of content is being produced and consumed on the Internet. As the volume of data has risen sharply, information search technology has become important. However, it requires considerable background knowledge and has therefore been regarded as difficult to approach. The main problems with earlier search engines were the large number of qualified personnel with that background knowledge and the substantial development costs they required. As a result, search engines have been regarded as the exclusive domain of leading IT companies and specialized organizations. This study proposes a search engine with an index structure that makes information easy to search effectively, built by crawling a massive number of websites and web documents in a distributed environment. The proposed search engine is implemented on Hadoop to support distributed processing.

Table of Contents

Abstract
 1. Introduction
 2. Related Researches
  2.1. Hadoop
  2.2. YARN
  2.3. Lucene
  2.4. Zookeeper
 3. System Design and Experiment
  3.1. Environment Setup
  3.2. Server Environment Setup
  3.3. Experiment
 4. Conclusion
 References

Author Information

  • Chang-Su Kim, PaiChai University, 155-40, Baejae-ro, SeoGu, DaeJeon, Korea
