Full-Text Information
Abstract
English
Traditional data management is usually based on relational databases, which handle small amounts of data well. However, relational databases have difficulty querying, managing, and analyzing large and massive data sets. How to manage massive data effectively is a problem that deserves study. In this paper, data from a traditional relational database is migrated to Hadoop so that query and analysis can be carried out on Hadoop. We vary the number of data records and the number of nodes in the cluster, and record the query time under each condition. Comparing these measurements with the query time on the relational database Oracle reveals the advantages and disadvantages of querying on Hadoop, and analysis identifies the factors that affect query time on Hadoop. The results also serve as a reference for future research and for data management on cloud platforms.
Table of Contents
1. Introduction
2. Interrelated Research
2.1 Hadoop Platform
2.2 HDFS
2.3 MapReduce
2.4 HBase
2.5 Hive
2.6 Sqoop
3. Processing and Analysis on Hadoop
3.1 Preparation of the Data
3.2 Comparison and Analysis
3.3 Result of the Experiments
3.4 Comparison of Query Efficiency
3.5 Analysis of Results
4. Conclusion
Acknowledgment
References