Abstract
English
Language modeling for information retrieval (IR), proposed a few years ago, has attracted
attention and has effectively improved the performance of IR systems compared with classic
models and approaches. Smoothing in parameter estimation is one of the main problems in
implementing language models, and effective smoothing methods enhance the performance of an
IR system. Semantic smoothing, which incorporates some knowledge of the language, has recently
been developed for language modeling. This paper presents a modification to a smoothing
approach in the general language model that combines it with translation modeling, taking
synonyms in the documents and in the collection into account for semantic smoothing in order
to improve performance in Chinese document retrieval. The synonym knowledge comes from a
thesaurus well known in Chinese NLP, Tongyici Cilin (Extended). A comparison shows that the
semantically smoothed approach brings an improvement of approximately 1.33% on average.
Table of Contents
1. Introduction
2. Related Work
3. Semantic GLM for IR
4. Evaluation
5. Conclusion
References