Source Information
Abstract
English
Word similarity computation plays an important role in natural language processing (NLP). Recently, algorithms based on HowNet have been widely used and have proven to work well for Chinese word similarity computation. However, these algorithms do not consider the relationship between the number of brother nodes and the fineness of the hierarchy. This paper investigates the ratio between two words' numbers of brother nodes, which it calls sememe probability density, and proposes an improved algorithm based on HowNet. The results indicate that the correlation measure of the proposed algorithm is 75.4%, substantially higher than that of the major state-of-the-art method (68.1%).
Table of Contents
1. Introduction
2. Related Work
3. Algorithm
3.1 HowNet
3.2 Similarity between Sememes
3.3 Similarity between Sets
3.4 Similarity between Concepts
3.5 Similarity between Words
4. Evaluation
4.1 Data Set and Setting
4.2 Experimental Results
5. Conclusions
ACKNOWLEDGEMENTS
References
