Linguistic Description for the Information Extraction of Time Expressions Containing Digits (숫자포함 시간표현의 정보추출을 위한 언어 기술)

남지순

Abstract

This study describes the linguistic patterns of Korean time expressions containing digits, as observed in real texts such as online daily newspapers. Time information is among the most important kinds of information that any information extraction system needs to recognize automatically. Because this information is conveyed by linguistic patterns specific to each natural language, an exhaustive and reliable description of these expressions is required. We first observed the types of time expressions that occur in online newspapers and then classified them into 13 classes. We represented these classes in the LGG formalism, which is well suited to expressing finite-local constraints. The Unitex system, designed to compile this formalism, converts LGG graphs into finite-state automata and applies them during text analysis to extract the time expressions described in the graphs.
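As a rough illustration of the idea of recognizing digit-containing time expressions with finite-state patterns, the sketch below uses Python's standard `re` module rather than Unitex or LGG graphs. The two patterns (`DATE`, `CLOCK`) and the function name `extract_timex` are hypothetical simplifications chosen for this example; they do not reproduce the 13 classes described in the paper.

```python
import re

# Hypothetical, simplified patterns (assumptions for illustration only;
# they are not the paper's 13 classes or its LGG graphs).
# DATE: optional year and month, then a day, e.g. "2008년 3월 5일" or "5일".
DATE = r"(?:\d{2,4}년\s*)?(?:\d{1,2}월\s*)?\d{1,2}일"
# CLOCK: optional 오전/오후 (AM/PM), an hour, optional minutes,
# e.g. "오후 3시 30분" or "3시".
CLOCK = r"(?:(?:오전|오후)\s*)?\d{1,2}시(?:\s*\d{1,2}분)?"

# A regular expression compiles to a finite-state matcher, which is the
# same family of device that LGG graphs are converted into.
TIMEX = re.compile(rf"{DATE}|{CLOCK}")


def extract_timex(text):
    """Return all substrings of text matched by the sketch patterns."""
    return [m.group(0) for m in TIMEX.finditer(text)]


if __name__ == "__main__":
    print(extract_timex("회의는 2008년 3월 5일 오후 3시 30분에 열렸다."))
```

A fuller description would need many more patterns (durations, ranges, expressions such as year-and-month without a day), which is the kind of coverage an exhaustive class inventory is meant to provide.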

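Table of Contents

1. Introduction
2. Digit-Containing Time Expressions Observed in Newspaper Article Texts
  2.1. Corpus of Digit-Containing Time Expressions
  2.2. Delimiting the Scope of the Study
  2.3. Methods and Criteria for Subclassification
3. Regular Grammar for Recognizing Time Expressions Containing Digits
  3.1. Regular Expressions and Automata
  3.2. The Finite Graph Grammar LGG
  3.3. LGG-DigitTimex for Time Expressions Containing Digits
4. Conclusion
References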

Author Information

  • 남지순, Hankuk University of Foreign Studies
