Source Information
Abstract
English
Word sense disambiguation has long been a key problem in natural language processing. In this paper, we use information gain to calculate the weight of the context at each position around an ambiguous word, reflecting how strongly that position affects the word's sense. On this basis, we select the six context positions before and after the ambiguous word to construct the feature vectors. The features are assigned different weights in the Bayesian model, which improves the model. We use HowNet senses to describe the meanings of the ambiguous words. In experiments on 10 Chinese ambiguous words, the average accuracy was 95.72% in the closed test and 85.71% in the open test. The results show that the proposed method is effective.
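The approach summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that each position's information gain is used as an exponent on that position's conditional probability (a multiplier in log space), that the window covers six positions on each side, and that add-one smoothing is used; the abstract does not confirm any of these. The function names (`information_gain`, `train`, `classify`) and the toy English data are hypothetical, and sense labels are plain strings rather than HowNet definitions.

```python
import math
from collections import Counter, defaultdict

# Assumed window: six positions before and six after the ambiguous word.
POSITIONS = [p for p in range(-6, 7) if p != 0]

def entropy(labels):
    """Shannon entropy (in bits) of a list of sense labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(values, labels):
    """Information gain of one context position with respect to the sense."""
    total = len(labels)
    by_value = defaultdict(list)
    for value, label in zip(values, labels):
        by_value[value].append(label)
    conditional = sum(len(ys) / total * entropy(ys) for ys in by_value.values())
    return entropy(labels) - conditional

def train(samples):
    """samples: list of (context, sense); context maps position -> word."""
    labels = [sense for _, sense in samples]
    # One weight per position; a missing position is treated as the value None.
    weights = {p: information_gain([ctx.get(p) for ctx, _ in samples], labels)
               for p in POSITIONS}
    sense_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # (sense, position) -> word counts
    vocab = set()
    for ctx, sense in samples:
        for p in POSITIONS:
            word_counts[(sense, p)][ctx.get(p)] += 1
            vocab.add(ctx.get(p))
    return weights, sense_counts, word_counts, vocab

def classify(model, context):
    """Return the sense with the highest weighted log-posterior score."""
    weights, sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        for p in POSITIONS:
            # Add-one smoothing (an assumption), with one extra slot
            # reserved for words never seen in training.
            count = word_counts[(sense, p)][context.get(p)]
            prob = (count + 1) / (n + len(vocab) + 1)
            score += weights[p] * math.log(prob)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Toy data: the word before the target is informative, the word after is not.
samples = [
    ({-1: "money", 1: "the"}, "finance"),
    ({-1: "loan", 1: "the"}, "finance"),
    ({-1: "river", 1: "the"}, "river"),
    ({-1: "muddy", 1: "the"}, "river"),
]
model = train(samples)
prediction = classify(model, {-1: "money", 1: "the"})
```

In this toy data, position -1 separates the two senses perfectly while position +1 always holds the same word, so the sketch gives position +1 zero weight and it has no influence on the decision.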
Table of Contents
1. Introduction
2. The improved model for Information Gain based on Bayesian model in Word Sense Disambiguation
2.1. The processing of Word Sense Disambiguation
2.2. HowNet
2.3. Information Gain
2.4. Improved the Bayesian Model of Word Sense Disambiguation based on Information Gain
2.5. Deal with Data smoothing
3. Experiments and Results Analysis
3.1. Experimental Process
3.2. Results Analysis
4. Conclusions
References