Thesaurus-Based Semantic Smoothing in Language Modeling for Chinese Document Retrieval

Article Information

Liqi Gao, Ting Liu, Ru Chen, Yu Zhang

Citations: 0 (source: Naver Academic Information)

Abstract

English

Language modeling for information retrieval (IR), proposed a few years ago, has attracted
attention and has effectively improved the performance of IR systems compared with classic
models and approaches. Smoothing in parameter estimation is one of the main problems in
implementing language models, and effective smoothing methods enhance the performance of an
IR system. Semantic smoothing, which incorporates some knowledge of the language, has recently
been developed for language modeling. This paper presents a modification to a smoothing
approach in the general language model (GLM) combined with translation modeling: it takes
synonyms in documents and in the collection into account for semantic smoothing, with the aim
of improving performance in Chinese document retrieval. The synonym knowledge comes from
Tongyici Cilin (Extended), a well-known thesaurus in Chinese NLP. A comparison shows that the
semantically smoothed approach brings an improvement of approximately 1.33% on average.
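
The abstract does not give the paper's formula, so the following is only a minimal sketch of
the general idea it describes: a document language model interpolated with a synonym-based
translation component and then smoothed against the collection model. The function name
smoothed_prob, the weights lam and beta, the uniform translation probability over a synonym
set, and the toy English vocabulary are all assumptions made for illustration; they are not
taken from the paper or from Tongyici Cilin (Extended).

```python
from collections import Counter


def smoothed_prob(term, doc_tokens, coll_counts, coll_len, synonyms,
                  lam=0.5, beta=0.3):
    """Hypothetical semantic smoothing of P(term | document).

    P(t|d) = (1 - lam) * [(1 - beta) * P_ml(t|d) + beta * P_tr(t|d)]
             + lam * P(t|C)
    where P_tr(t|d) = sum over document words s of T(t|s) * P_ml(s|d),
    and T(t|s) is assumed uniform over the synonym set of s.
    """
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)

    # Maximum-likelihood estimate from the document itself.
    p_ml = doc_counts[term] / doc_len if doc_len else 0.0

    # Translation component: each document word passes probability
    # mass to its synonyms (uniform split, an illustrative assumption).
    p_tr = 0.0
    for word, count in doc_counts.items():
        syns = synonyms.get(word, set())
        if term in syns:
            p_tr += (1.0 / len(syns)) * (count / doc_len)

    # Collection (background) model used for smoothing.
    p_coll = coll_counts.get(term, 0) / coll_len if coll_len else 0.0

    p_doc = (1 - beta) * p_ml + beta * p_tr
    return (1 - lam) * p_doc + lam * p_coll


# Toy data: the document never contains "automobile", but "car" does occur.
doc = ["car", "fast"]
collection = {"car": 2, "automobile": 1, "fast": 1}
thesaurus = {"car": {"automobile", "vehicle"}}

with_synonyms = smoothed_prob("automobile", doc, collection, 4, thesaurus)
without_synonyms = smoothed_prob("automobile", doc, collection, 4, {})
```

In this toy setup the query term receives a higher probability when the synonym entry is
present than when only collection smoothing applies, which is the effect the abstract
attributes to semantic smoothing.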

Table of Contents

Abstract
 1. Introduction
 2. Related Work
 3. Semantic GLM for IR
 4. Evaluation
 5. Conclusion
 References

Author Information

  • Liqi Gao, Information Retrieval Laboratory, School of Computer Science & Technology, Harbin Institute of Technology
  • Ting Liu, Information Retrieval Laboratory, School of Computer Science & Technology, Harbin Institute of Technology
  • Ru Chen, Information Retrieval Laboratory, School of Computer Science & Technology, Harbin Institute of Technology
  • Yu Zhang, Information Retrieval Laboratory, School of Computer Science & Technology, Harbin Institute of Technology
