A Graph Theoretical Preprocessing Step for Text Compression

Abstract

English

This paper presents CSGM2, a text preprocessing technique for compression. It converts the original text into a word net (a graph representation) that retains detailed contextual information such as word proximity. A specific directed graph is proposed to model this word net, in which vertices store words and edges represent word transitions. The word net fully preserves the natural word order of the original text and hence can be used directly for encoding.
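
The directed-graph idea described in the abstract can be illustrated with a minimal Python sketch. This is not the CSGM2 algorithm itself, whose details are not given on this page. The function names `build_word_net` and `restore_text`, the whitespace tokenization, and the choice to label each edge with the token positions at which the transition occurs are all assumptions made for illustration.

```python
from collections import defaultdict


def build_word_net(text):
    """Build a directed word graph from whitespace-separated tokens.

    Assumption (not from the paper): each edge (u, v) is labeled with
    the positions i at which token i is v and token i - 1 is u.
    Returns the first word and the edge dictionary.
    """
    tokens = text.split()
    edges = defaultdict(list)
    for i in range(1, len(tokens)):
        edges[(tokens[i - 1], tokens[i])].append(i)
    start = tokens[0] if tokens else None
    return start, dict(edges)


def restore_text(start, edges):
    """Recover the token sequence by sorting edge targets by position label."""
    if start is None:
        return ""
    by_position = {}
    for (_, target), positions in edges.items():
        for i in positions:
            by_position[i] = target
    words = [start] + [by_position[i] for i in sorted(by_position)]
    return " ".join(words)


sample = "the cat saw the dog and the cat ran"
start, edges = build_word_net(sample)
vertices = {word for pair in edges for word in pair}
```

In this sketch, repeated words share a single vertex, so the nine tokens of `sample` collapse into six vertices, while the position labels on the edges keep enough information for `restore_text` to rebuild the original word order.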

Table of Contents

Abstract
 1. Introduction
 2. Related Work
  2.1. Statistical Compression
  2.2. Dictionary-based Compression
  2.3. Preprocessing-based Compression
 3. Natural Language Text Modeling and Compression
  3.1. Graph-based Modeling for Natural Language Texts
 4. The CSGM2 Transformation
  4.1. Basic Concepts of Graph Theory
  4.2. Word Net Building
  4.3. Transforming Natural Language Text through a Word Net
 5. Example
 6. Conclusion
 References

Author Information

  • Kaushik K. Phukon Department of Computer Science, Gauhati University
  • Hemanta K. Baruah Vice-Chancellor, Bodoland University, Assam, India
