Source Information
Segmentation of Korean Compound Nouns Using Semantic Category Analysis of Unregistered Nouns
Abstract
English
This paper proposes a method for segmenting compound nouns that contain unregistered nouns into the correct combination of unit nouns, using characteristics of person names, loanwords, and location names. A Korean person's name generally consists of three syllables, only a relatively small number of syllables are used as last names, and the combination of the second and third syllables is somewhat restricted. In addition, many person names appear together with clue words in compound nouns. Most loanwords contain one or more syllables that cannot appear in Korean words, or have syllable sequences that differ from those of ordinary Korean words. Location names are generally used with clue words designating districts in compound nouns. Using these characteristics to analyze compound nouns not only makes segmentation more accurate but also helps natural language systems use the semantic categories of the unregistered nouns. Experimental results show that the precision of our method is approximately 98% on average. The precision of person name recognition and loanword recognition is about 94% and about 92%, respectively.
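The person-name heuristic described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the surname set, the clue words, the assumption that a clue word follows the name, and the function names are all hypothetical, and the restriction on second and third syllable combinations is not modeled.

```python
# Hypothetical, tiny resources for illustration only; the paper's actual
# surname list and clue-word dictionary are not given in the abstract.
COMMON_SURNAMES = {"김", "이", "박", "최", "정"}
NAME_CLUE_WORDS = ("교수", "박사", "대통령")  # assumed examples of titles


def is_hangul_syllable(ch: str) -> bool:
    """True if ch is a precomposed Hangul syllable (U+AC00 to U+D7A3)."""
    return "\uac00" <= ch <= "\ud7a3"


def looks_like_person_name(token: str) -> bool:
    """Three Hangul syllables whose first syllable is a known surname."""
    return (
        len(token) == 3
        and all(is_hangul_syllable(c) for c in token)
        and token[0] in COMMON_SURNAMES
    )


def split_name_compound(compound: str):
    """Split 'name + clue word' into [name, clue]; return None otherwise.

    Assumes the clue word follows the name, which is a simplification.
    """
    for clue in NAME_CLUE_WORDS:
        if compound.endswith(clue):
            head = compound[: -len(clue)]
            if looks_like_person_name(head):
                return [head, clue]
    return None
```

For example, under these assumptions `split_name_compound("김철수교수")` separates the unregistered name from the title, while a compound with no matching clue word is left for other analysis.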
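Table of Contents
1. Introduction
2. Errors Caused by Unregistered Words in Compound Nouns
3. Loanword Recognition
3.1 Syllable Occurrence Characteristics
3.2 Phoneme Combination Characteristics
4. Person Name Recognition
4.1 Building Clue Words
5. Location Name Recognition
6. System Configuration
7. Experiments and Analysis
8. Conclusion
References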