An Efficient Compression Algorithm for Forthcoming New Species

Abstract

English

Genomic repositories are steadily accumulating individual and reference sequences that share long identical and near-identical strings of nucleotides. This paper proposes a lossless DNA data compression technique, Optimized Base Repeat Length DNA Compression (OBRLDNAComp), which exploits the redundancy of DNA sequences. Compression is essential for easier storage, shorter retrieval times, and finding similarity within and between sequences. OBRLDNAComp searches for long identical and near-identical strings of nucleotides that other DNA-specific compression algorithms overlook. The technique is designed so that the longest possible exact repeats yield the greatest benefit to the compression ratio. It scans a sequence from left to right to gather repeat statistics, then applies a substitution technique to compress those repeats. The algorithm is straightforward and needs no external reference file; it scans only the individual file for compression and decompression. The achieved compression ratio of 1.673 bpb (bits per base) outperforms many non-reference-based compression methods.
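
To make the left-to-right statistics pass and the bits-per-base figure concrete, here is a minimal Python sketch. It is not the OBRLDNAComp algorithm: the abstract does not define what counts as a repeat or which substitution codes are used, so treating a repeat as a run of one identical base is an assumption, and the names `run_statistics` and `bits_per_base` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter
from itertools import groupby


def run_statistics(seq: str) -> Counter:
    """Scan seq left to right and count (base, run_length) pairs.

    Assumption: a "repeat" here means a run of one identical base.
    The paper's actual repeat definition is not given in the abstract.
    """
    return Counter((base, len(list(group))) for base, group in groupby(seq))


def bits_per_base(compressed_bytes: int, n_bases: int) -> float:
    """Compression ratio as compressed size in bits divided by base count."""
    return 8 * compressed_bytes / n_bases


# Runs found in a toy sequence: AAAA, CC, G, TTTT.
stats = run_statistics("AAAACCGTTTT")

# Plain 2-bit packing stores 4 bases per byte, so 100 bases take 25 bytes.
baseline = bits_per_base(compressed_bytes=25, n_bases=100)
```

With a four-letter alphabet, plain 2-bit packing gives 2.0 bpb, which is the natural baseline against which a figure such as 1.673 bpb can be read.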

Table of Contents

Abstract
 1. Introduction
 2. Background
 3. Proposed Method
  3.1. First Pass
  3.2. Second Pass
  3.3. OBRLDNAComp
  3.4. Decompression
 4. Experimental Results
 5. Concluding Remarks
 References

Author Information

  • Subhankar Roy, Department of Computer Science and Engineering, Academy of Technology, G. T. Road, Aedconagar, Hooghly - 712121, W.B., India
  • Sudip Mondal, Department of Computer Science and Engineering, University of Calcutta, 92, A.P.C. Road, Kolkata - 700009, W.B., India
  • Sunirmal Khatua, Department of Computer Science and Engineering, University of Calcutta, 92, A.P.C. Road, Kolkata - 700009, W.B., India
  • Moumita Biswas, Department of Computer Science and Engineering, University of Calcutta, 92, A.P.C. Road, Kolkata - 700009, W.B., India
