Hadoop based Weblog Analysis : A Review

Abstract

English

The growth of websites and the Internet has opened up new research, social, entertainment, education and business opportunities. As the Internet grows, the digital data generated by websites has become so massive that traditional text-processing software and relational database technology hit a bottleneck when processing it, and the results they produce are not satisfactory. Cloud computing offers a good solution to this problem: it can not only store such massive data but also process and analyze it faster by using distributed storage and distributed computing technology. A weblog is a group of connected web pages consisting of a log, or daily record, of information, particular fields or views, which is altered from time to time by the site owner, by other websites or by the website's users. An enterprise weblog analysis system based on the Hadoop architecture, with the Hadoop Distributed File System (HDFS), the Hadoop MapReduce software framework and the Pig Latin language, aids the business decision-making of system administrators and helps them collect and identify the potential value hidden within the huge volume of data generated by websites. Such a weblog analysis covers a website's entry log and provides information about the number of visitors, the busiest days of the week and rush hours, views, hits, the most frequently accessed pages, application server traffic trends, performance reports at varying intervals, and statistical reports that indicate the performance of the program.
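As a rough illustration of the kind of indicators listed in the abstract (hits per page, rush hours, days of the week, visitors), the following is a minimal single-machine sketch of the map and reduce steps in plain Python. It is not the paper's Hadoop or Pig implementation. It assumes the access log uses the Apache Common Log Format, which the abstract does not specify, and the names `LOG_PATTERN`, `map_line`, `reduce_counts`, `analyze` and `SAMPLE_LOG` are hypothetical and introduced only for this example.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: log lines follow the Apache Common Log Format.
# The paper's abstract does not state the actual log format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def map_line(line):
    """Map step: emit (key, 1) pairs for one log line; skip unparseable lines."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return []
    when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
    return [
        (("page", match["path"]), 1),                  # hits per page
        (("hour", when.hour), 1),                      # hits per hour of day
        (("weekday", WEEKDAYS[when.weekday()]), 1),    # hits per day of week
        (("visitor", match["host"]), 1),               # hits per client host
    ]


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def analyze(lines):
    """Run the map step over every line, then reduce all emitted pairs."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


# Hypothetical sample data, made up for this sketch.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2016:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2016:13:58:02 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [11/Oct/2016:09:12:45 +0000] "GET /about.html HTTP/1.1" 200 1024',
]

if __name__ == "__main__":
    totals = analyze(SAMPLE_LOG)
    for key, count in sorted(totals.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        print(key, count)
```

In this sketch, the number of distinct visitors is the number of distinct `("visitor", host)` keys, while each key's count is the number of hits from that host. On a Hadoop cluster the same map and reduce logic would run in parallel over log blocks stored in HDFS; here both steps run in a single process for clarity.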

Table of Contents

Abstract
 1. Introduction
  1.1. History of Weblog Analysis using Hadoop
  1.2. Various Steps to Perform a Weblog Analysis using Hadoop
 2. Literature Review
  2.1. Various Indicators derived through Weblog Analysis
  2.2. Hadoop
  2.3. Hadoop Distributed File System (HDFS)
  2.4. Hadoop MapReduce
  2.5. Pig Programming Language
 3. Results and Comparisons
 4. Conclusion
 5. Future Scope
 References

Author Information

  • Pooja D. Savant, Dept. of Information Technology, Bharati Vidyapeeth Deemed University College of Engineering, Pune-411043, India
  • Debnath Bhattacharyya, Department of Computer Science and Engineering, Vignan’s Institute of Information Technology, Visakhapatnam-530049, India
  • Tai-hoon Kim, Department of Convergence Security, Sungshin Women's University, 249-1, Dongseon-dong 3-ga, Seoul, 136-742, Korea
