IT Marketing and Policy

Design of Distributed Cloud System for Managing large-scale Genomic Data

Abstract

English

The volume of genomic data is constantly increasing in various modern industries and research fields. This growth presents new challenges and opportunities in terms of the quantity and diversity of genetic data. In this paper, we propose a distributed cloud system for integrating and managing large-scale gene databases. By introducing a distributed data storage and processing system based on the Hadoop Distributed File System (HDFS), various formats and sizes of genomic data can be efficiently integrated. Furthermore, by leveraging Spark on YARN, efficient management of distributed cloud computing tasks and optimal resource allocation are achieved. This establishes a foundation for the rapid processing and analysis of large-scale genomic data. Additionally, by utilizing BigQuery ML, machine learning models are developed to support genetic search and prediction, enabling researchers to more effectively utilize data. It is expected that this will contribute to driving innovative advancements in genetic research and applications.

Table of Contents

Abstract
1. INTRODUCTION
2. PROPOSED SYSTEM
2.1 System Components
2.2 BigQuery ML
2.3 Operational Procedure
3. COMPARISON OF SYSTEMS
4. CONCLUSION
ACKNOWLEDGMENT
References

Author Information

  • Seine Jang, Master's course, Graduate School of Smart Convergence, Kwangwoon University, Seoul
  • Seok-Jae Moon, Professor, Graduate School of Smart Convergence, Kwangwoon University, Seoul, Korea
