

Query and Analysis of Data on Electric Consumption Based on Hadoop



Traditional data management usually relies on relational databases, which handle small volumes of data well but have difficulty querying, managing, and analyzing large and massive data sets. How to manage massive data effectively is therefore a problem worth studying. In this paper, data held in a traditional relational database are migrated to Hadoop so that query and analysis can be carried out on Hadoop. The experiments vary the number of data records and the number of nodes in the cluster and record the query time under each condition. Comparing these measurements with the query time on the relational database Oracle shows the advantages and disadvantages of querying on Hadoop, and the analysis identifies the factors that affect query time on Hadoop. The results also serve as a reference for future research and data management on cloud platforms.


 2. Interrelated Research
  2.1 Hadoop Platform
  2.2 HDFS
  2.3 MapReduce
  2.4 HBase
  2.5 Hive
  2.6 Sqoop
 3. Processing and Analysis on Hadoop
  3.1 Preparation of the Data
  3.2 Comparison and Analysis
  3.3 Result of the Experiments
  3.4 Comparison of Query Efficiency
  3.5 Analysis of Results
 4. Conclusion


  • Jianjun Zhou, Information Science and Technology, Heilongjiang University
  • Yi Wu, School of Information Science and Technology, Heilongjiang University, Harbin, Heilongjiang, China


Source: Naver Academic Information
