Comparison Architecture for Large Number of Genomic Sequences

Hae-won Choi; Myung-Chun Ryoo; Joon-Ho Park

한국EA학회, 정보화연구, Vol. 9, No. 1, March 2012, pp. 11-19 (KCI-listed)


Abstract

A suffix tree is an efficient data structure because it reveals the detailed internal structure of the given sequences in linear time. However, it is difficult to build a suffix tree for a large number of sequences because of memory size constraints. Comparing multi-megabase genomic sequence sets with suffix trees therefore requires redesigning the suffix tree construction algorithms. We introduce a new method for constructing a suffix tree on secondary storage for a large number of sequences. Our algorithm divides three files, in a designated sequence, into parts and stores references to the locations of edges in hash tables. In our experiments, we used 1,300,000 EST sequences, about 300 MB in total, to generate a suffix tree on disk.
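The general idea of keeping suffix tree edges on disk and locating them through a hash table can be illustrated with a small sketch. This is not the authors' algorithm: it uses naive quadratic-time insertion, a single edge file rather than a partitioned layout, and an in-memory Python dict as the hash table. The names `DiskSuffixTree`, `EDGE`, and `contains` are hypothetical, and the record layout is an assumption made for illustration.

```python
import struct
import tempfile

# Assumed fixed-size edge record: sequence id, label start, label end
# (exclusive), and the id of the child node the edge leads to.
EDGE = struct.Struct("<IIII")


class DiskSuffixTree:
    """Illustrative generalized suffix tree whose edges live in a file."""

    def __init__(self, sequences, fileobj):
        # "$" is an assumed terminator that must not occur in the input.
        self.seqs = [s + "$" for s in sequences]
        self.f = fileobj
        self.index = {}     # (node id, first char) -> byte offset of edge
        self.next_node = 1  # node 0 is the root
        for sid, s in enumerate(self.seqs):
            for pos in range(len(s)):
                self._insert(sid, pos)  # naive insertion, not linear time

    def _append(self, rec):
        self.f.seek(0, 2)  # move to end of file
        off = self.f.tell()
        self.f.write(EDGE.pack(*rec))
        return off

    def _read(self, off):
        self.f.seek(off)
        return EDGE.unpack(self.f.read(EDGE.size))

    def _new_node(self):
        node = self.next_node
        self.next_node += 1
        return node

    def _insert(self, sid, pos):
        s, node = self.seqs[sid], 0
        while pos < len(s):
            key = (node, s[pos])
            off = self.index.get(key)
            if off is None:  # no matching edge: add a new leaf edge
                rec = (sid, pos, len(s), self._new_node())
                self.index[key] = self._append(rec)
                return
            esid, start, end, child = self._read(off)
            label = self.seqs[esid]
            k = 0  # length of the match along this edge
            while start + k < end and pos + k < len(s) \
                    and label[start + k] == s[pos + k]:
                k += 1
            if start + k < end:  # mismatch inside the edge: split it
                mid = self._new_node()
                self.f.seek(off)
                self.f.write(EDGE.pack(esid, start, start + k, mid))
                tail = (esid, start + k, end, child)
                self.index[(mid, label[start + k])] = self._append(tail)
                child = mid
            node, pos = child, pos + k

    def contains(self, pattern):
        """Return True if pattern occurs in any indexed sequence."""
        node, pos = 0, 0
        while pos < len(pattern):
            off = self.index.get((node, pattern[pos]))
            if off is None:
                return False
            esid, start, end, child = self._read(off)
            label = self.seqs[esid][start:end]
            n = min(len(label), len(pattern) - pos)
            if label[:n] != pattern[pos:pos + n]:
                return False
            node, pos = child, pos + n
        return True


with tempfile.TemporaryFile() as edge_file:
    tree = DiskSuffixTree(["ACGT", "CGTA"], edge_file)
    found = tree.contains("GTA")
    missing = tree.contains("TT")
```

In this sketch the sequences themselves stay in memory and only the edge records go to disk. At the scale described in the abstract, the sequence data and the hash table would presumably also need a disk-backed or partitioned layout.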

Author information

Hae-won Choi, Department of Computer Engineering, Kyungwoon University, Korea
Myung-Chun Ryoo, Department of Computer Engineering, Kyungwoon University, Korea
Joon-Ho Park, Department of Computer Engineering, Kyungwoon University, Korea
