A Q-gram Filter for Local Alignment in Large Genomic Database

Abstract

English

Fast and exact searching for sequences similar to a query sequence in genomic databases remains a challenging task in molecular biology. This paper considers the problem of finding all e-matches in a large genomic database, i.e., all local alignments over a given length w with an error rate of at most e. A new database searching algorithm, QFLA, is designed to solve this problem. QFLA is a full-sensitivity algorithm: a refined q-gram filter implemented on a q-gram index. First, new features are extracted from match-regions by logically partitioning both the query sequence and the genomic database. Second, these features are used to quickly eliminate a large part of the irrelevant subsequences during the search. Last, the unfiltered regions are verified with the well-known Smith-Waterman algorithm. The experimental results demonstrate that the algorithm saves time by improving filtration efficiency while keeping filtration time short.
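To make the filter-then-verify idea concrete, the following is a minimal sketch of a classic q-gram counting filter over a q-gram index. It is not the QFLA algorithm and does not reproduce the refined match-region features described in the abstract. The function names are hypothetical, the threshold comes from the standard q-gram lemma, and counting hits on exact diagonals is a simplification, since insertions and deletions shift hits to neighboring diagonals.

```python
import math
from collections import defaultdict


def build_qgram_index(text, q):
    """Map every q-gram in text to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(text) - q + 1):
        index[text[i:i + q]].append(i)
    return index


def qgram_threshold(w, q, e):
    """q-gram lemma: two length-w strings within k = floor(e * w) edit
    errors share at least w + 1 - q * (k + 1) q-grams."""
    k = math.floor(e * w)
    return w + 1 - q * (k + 1)


def diagonal_hit_counts(query, index, q):
    """Count q-gram hits per diagonal (database position minus query position)."""
    counts = defaultdict(int)
    for j in range(len(query) - q + 1):
        for i in index.get(query[j:j + q], ()):
            counts[i - j] += 1
    return counts


# Hypothetical toy data, for illustration only.
database = "ACGTACGT"
index = build_qgram_index(database, 3)
hits = diagonal_hit_counts("ACGT", index, 3)
```

In a filter of this kind, only regions whose hit count reaches the threshold would be passed on to a verification step such as Smith-Waterman; everything else is discarded without alignment.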

Table of Contents

Abstract
 1. Introduction
 2. Preliminaries
 3. A Refined Q-gram Filter
  3.1. Match-region Feature Extraction Based on Partition
  3.2. New Filter
  3.3. Invalidation and Degeneration
 4. Analysis
 5. Experimental Results
  5.1. Experimental Environment
  5.2. Parameter z
  5.3. Performance
  5.4. Discussion
 6. Conclusion
 References

Author Information

  • Decai Sun, College of Information Science and Technology, Bohai University, Jinzhou 121013, China
  • Xiaoxia Wang, Teaching and Research Institute of College Computer, Bohai University, Jinzhou 121013, China

