Article Information
Abstract
English
Spatial data sets contain much useful information, but their volume is massive and their types are complex, which makes them hard to analyze. Software tools exist for processing general data, and Hadoop is one such tool for big data; it can also be used to analyze large amounts of spatial data. This paper proposes a data analysis technique for massive spatial data using Hadoop. We extend the grid-based clustering algorithm to run on Hadoop. The grid-based clustering algorithm builds clusters from cells. Each cell holds a count of the objects it contains. Only cells with a sufficient population can join a cluster; the remaining cells are ignored as noise. The proposed extension uses Hadoop to improve the performance of this algorithm. To evaluate the improvement, execution times are measured and compared. As a result, the proposed algorithm is 1.8 times faster than the original grid-based clustering algorithm.
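The grid-based clustering idea summarized in the abstract can be illustrated with a minimal single-machine sketch in Python. It is not the paper's implementation. The abstract does not say how dense cells are grouped, so several details here are assumptions: points are two-dimensional, cells are square, and dense cells are merged when they touch (8-neighbor adjacency). The names `cell_of`, `count_cells`, `cluster_cells`, `cell_size`, and `min_population` are hypothetical. The counting step mirrors what a map phase (emit a cell key per object) and a reduce phase (sum the counts per cell) would compute on Hadoop.

```python
from collections import Counter, deque


def cell_of(point, cell_size):
    """Return the grid cell (column, row) that contains a 2D point."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def count_cells(points, cell_size):
    """Count objects per cell.

    Conceptually this is the map step (emit one cell key per point)
    followed by the reduce step (sum the emitted keys per cell).
    """
    return Counter(cell_of(p, cell_size) for p in points)


def cluster_cells(counts, min_population):
    """Group dense cells into clusters; sparse cells are treated as noise.

    Assumption: two dense cells belong to the same cluster when they are
    neighbors in the 8-connected sense (sharing an edge or a corner).
    Returns a dict mapping each dense cell to a cluster label.
    """
    dense = {cell for cell, n in counts.items() if n >= min_population}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        # Breadth-first search over neighboring dense cells.
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (cx + dx, cy + dy)
                    if neighbor in dense and neighbor not in labels:
                        labels[neighbor] = next_label
                        queue.append(neighbor)
        next_label += 1
    return labels
```

Calling `cluster_cells(count_cells(points, cell_size), min_population)` yields a label for every dense cell; any cell missing from the result fell below the population threshold and is treated as noise. Because the per-cell counts are independent sums, the counting step is the part that would distribute naturally across MapReduce workers.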
Table of Contents
1. Introduction
2. Related Works
2.1. Hadoop MapReduce
2.2. Grid Based Clustering Algorithm
3. Grid Based Clustering Algorithm Using Hadoop MapReduce
3.1. Map Method
3.2. Reduce Method
3.3. Clustering Method
4. Experiment and Result
4.1. Data Set and Implementing the Experiment
4.2. Result
5. Conclusion
Acknowledgments
References