Article Information
Abstract
English
Improving research productivity in bioinformatics through effective large-scale data analysis requires overcoming many obstacles. These include standardization of data collection and analysis, management of computing and storage resources, ease of parallel programming, and efficiency of data analysis job execution, to name a few. Among these, ease of parallel programming is a crucial factor in the usability and efficiency of large-scale data analysis.
This paper describes a biologic data analysis platform built on a cloud computing infrastructure. The platform provides an easy-to-use parallel data analysis environment and ultimately enhances the productivity of bioinformatics research.
Table of Contents
1. Introduction
2. Requirements of BioDAP
3. Design of BioDAP
3.1. Virtual Infrastructure
3.2. Biologic Data Integration System
3.3. Data Set and Provenance Management System
3.4. Data Analysis Programming Environment
3.5. Analysis Pipeline Execution Engine
4. Related Work
5. Conclusions
Acknowledgements
References