earticle

An Empirical Study for Handling Scientific Datasets

Abstract

Since the volume of data generated by scientific experiments has grown exponentially, new methods to analyze and organize the data are required. These methods in turn need an effective infrastructure of computing resources for pre-processing and post-processing the data. This demanding requirement has led to the development of techniques that reduce the size of a dataset and of new programming models and implementations such as MapReduce. In this paper, we describe an empirical study of handling the dataset of a scientific experiment to support data transformation, an essential phase in handling large-scale data in scientific experiments. In this experiment we show how to optimize a dataset written in netCDF through data reduction by sub-setting, and how to process a dataset on tornado outbreaks in the US with Hadoop, a MapReduce framework. These methods can be applied to pre-processing and post-processing in scientific data experiments.
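The MapReduce processing the abstract describes can be illustrated with a minimal in-memory sketch of the map, shuffle, and reduce phases. The record layout (year, state, EF-scale) and the counting task are hypothetical illustrations, not the paper's actual tornado dataset schema or its Hadoop implementation:

```python
from collections import defaultdict

# Hypothetical flattened tornado records: (year, state, EF-scale).
records = [
    (2011, "AL", 4),
    (2011, "AL", 2),
    (2011, "MO", 5),
    (2012, "OK", 3),
]

def map_phase(record):
    """Emit one (key, value) pair per record: one tornado per state."""
    year, state, ef = record
    yield (state, 1)

def reduce_phase(key, values):
    """Sum all counts grouped under one key."""
    return (key, sum(values))

# Shuffle: group intermediate pairs by key, as the framework would.
groups = defaultdict(list)
for rec in records:
    for k, v in map_phase(rec):
        groups[k].append(v)

counts = dict(reduce_phase(k, vs) for k, vs in groups.items())
print(counts)  # {'AL': 2, 'MO': 1, 'OK': 1}
```

In Hadoop, the same map and reduce functions would run in parallel over partitions of the dataset, with the framework performing the shuffle between them.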

Table of Contents

Abstract
 1. Introduction
 2. Related Works
  2.1 PolarGrid: Scientific Data Project
  2.2 MapReduce
 3. Scientific Data Experiment Framework
 4. Examples of Scientific Data Experiment
  4.1 Data Reduction of Dataset
  4.2 Data Transformation for MapReduce Application
 5. Conclusion
 References

Author Information

  • Yunhee Kang Baekseok University, Samsung Advanced Institute of Technology
  • Heeyoul Choi Baekseok University, Samsung Advanced Institute of Technology

Source: Naver Academic Information
