Comparison Architecture for Large Number of Genomic Sequences

Article Information

Hae-won Choi, Myung-Chun Ryoo, Joon-Ho Park

Abstract

English

A suffix tree is an efficient data structure because it reveals the detailed internal structure of the given sequences in linear time. However, a suffix tree is difficult to implement for a large number of sequences because of memory size constraints. Comparing multi-megabase genomic sequence sets with suffix trees therefore requires restructuring the suffix tree algorithms. We introduce a new method for constructing a suffix tree for a large number of sequences on secondary storage. Our algorithm divides three files, in a designated sequence, into parts and stores references to the locations of edges in hash tables. In our experiments, we generated a suffix tree on disk from 1,300,000 EST (expressed sequence tag) sequences totaling about 300 MB.
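The idea of keeping suffix tree edges in a hash table can be illustrated with a small sketch. This is not the authors' algorithm: it uses a naive quadratic-time construction, holds everything in memory, and all names (build_suffix_tree, contains, the edge tuple layout) are assumptions made for illustration. Each edge is stored under the key (parent node number, first character of the edge label), and the value records where the label sits in the text and which child node it leads to.

```python
from typing import Dict, Tuple

# Hypothetical layout: (start, end, child); the edge label is text[start:end].
Edge = Tuple[int, int, int]
EdgeTable = Dict[Tuple[int, str], Edge]


def build_suffix_tree(text: str) -> EdgeTable:
    """Naive O(n^2) suffix tree construction.

    Assumes `text` ends with a terminator character that occurs nowhere
    else (for example "$"), so no suffix is a prefix of another suffix.
    Edges are kept in a hash table keyed by (node, first character).
    """
    edges: EdgeTable = {}
    next_node = 1  # node 0 is the root
    n = len(text)
    for i in range(n):  # insert the suffix text[i:]
        node, pos = 0, i
        while pos < n:
            key = (node, text[pos])
            if key not in edges:
                # No edge starts with this character: add a new leaf.
                edges[key] = (pos, n, next_node)
                next_node += 1
                break
            start, end, child = edges[key]
            # Count how many characters of the edge label match.
            k = 0
            while start + k < end and text[start + k] == text[pos + k]:
                k += 1
            if start + k == end:
                # Whole label matched: continue from the child node.
                node, pos = child, pos + k
                continue
            # Mismatch inside the label: split the edge at offset k.
            mid = next_node
            next_node += 1
            edges[key] = (start, start + k, mid)
            edges[(mid, text[start + k])] = (start + k, end, child)
            edges[(mid, text[pos + k])] = (pos + k, n, next_node)
            next_node += 1
            break
    return edges


def contains(edges: EdgeTable, text: str, pattern: str) -> bool:
    """Return True if `pattern` occurs in `text`, using only edge lookups."""
    node, i = 0, 0
    while i < len(pattern):
        edge = edges.get((node, pattern[i]))
        if edge is None:
            return False
        start, end, child = edge
        length = min(end - start, len(pattern) - i)
        if text[start:start + length] != pattern[i:i + length]:
            return False
        i += length
        node = child
    return True


if __name__ == "__main__":
    sample = "banana$"
    table = build_suffix_tree(sample)
    print(len(table), contains(table, sample, "nan"))
```

Because every access goes through a key lookup, the in-memory dict could in principle be replaced by a disk-backed mapping (Python's standard shelve module is one option, with keys converted to strings). How the paper actually partitions its files and hash tables is not specified in the abstract, so that part is left out here.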

Table of Contents

Abstract
1. Introduction
2. Proposed Method
2.1 Data structure
2.2 Storing Edges
2.3 Node Numbering Process
2.4 Storing a Hash Table
3. Experimentation and Analysis
4. Discussion and Conclusion
References

Author Information

  • Hae-won Choi, Department of Computer Engineering, Kyungwoon University, Korea
  • Myung-Chun Ryoo, Department of Computer Engineering, Kyungwoon University, Korea
  • Joon-Ho Park, Department of Computer Engineering, Kyungwoon University, Korea
