Word Sense Disambiguation by Bayesian Inference

김종휘

Source information

Korean Word Sense Disambiguation by Bayesian Inference

한국언어과학회 (Korean Association of Language Sciences), 언어과학 (Journal of Language Sciences), Vol. 15, No. 2, June 2008, pp. 41-59, KCI-listed

Abstract

This paper discriminates the senses of several ambiguous Korean words using Bayesian inference, which rests on conditional probability. A POS-tagged Korean corpus of 8.1 million words served as the source of linguistic information for disambiguation. In a disambiguation experiment on 13 words (9 nouns and 4 verbs), carried out with a program implementing the Bayesian algorithm, overall precision reached 81.5% (25981/31874), with 83.5% (12546/15030) for nouns and 79.8% (13435/16844) for verbs. During the experiment, several parameters were varied to find the optimal conditions for the method. The focus was on the smoothing value, varied from 0.9 to 0.0001, that is substituted for a co-occurrence frequency of 0 for a word in the context; contrary to general expectations, a smoothing value of 0.1 yielded the highest precision. Beyond the automated procedure and its promising result, the paper illustrates in detail how the individual words of corpus sentences are treated under Bayesian inference, clarifying how the method works.
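
The abstract does not give the exact formula, so the following is only a minimal sketch of the kind of procedure it describes, assuming a naive Bayes formulation in which each candidate sense is scored by its prior and by the co-occurrence frequencies of the context words, and in which a co-occurrence frequency of 0 is replaced by a fixed smoothing value. The function names, the toy training data, and the English tokens are hypothetical and are not taken from the paper or its corpus.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count sense frequencies and sense/context-word co-occurrences.

    examples: iterable of (sense, context_words) pairs (hypothetical format).
    """
    sense_counts = Counter()
    cooc = defaultdict(Counter)
    for sense, context in examples:
        sense_counts[sense] += 1
        cooc[sense].update(context)
    return sense_counts, cooc


def disambiguate(context, sense_counts, cooc, smoothing=0.1):
    """Return the sense maximizing log P(s) + sum over w of log P(w | s)."""
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        # Total number of context tokens seen with this sense.
        denom = sum(cooc[sense].values()) or 1
        score = math.log(n / total)
        for word in context:
            freq = cooc[sense][word]
            if freq == 0:
                # Assumption: the smoothing value stands in for a zero count.
                freq = smoothing
            score += math.log(freq / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Toy data for an ambiguous noun with two senses (illustrative only).
examples = [
    ("ship", ["harbor", "sail", "sea"]),
    ("ship", ["sea", "captain"]),
    ("pear", ["eat", "sweet", "fruit"]),
    ("pear", ["fruit", "tree"]),
]
sense_counts, cooc = train(examples)
print(disambiguate(["sea", "fruit", "sail"], sense_counts, cooc))
```

Rerunning `disambiguate` with different `smoothing` arguments (for example 0.9, 0.1, 0.0001) and comparing precision on held-out sentences would mirror the parameter sweep the abstract reports, under the same assumptions.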

Author information

김종휘, Youngsan University (영산대학교)
