Article Information
Abstract
English
This study proposes a methodology for constructing linguistic resources to train Natural Language Understanding (NLU) models for legal counseling services. A dataset based on the proposed linguistic resources is essential for developing non-face-to-face legal services that provide information related to legal problems. The linguistic resources were constructed through a bottom-up analysis of the linguistic patterns of legal expressions, background descriptions, and discourse types found in online legal counseling texts. In addition, we analyzed the hierarchical classification of keywords in existing legal service systems and defined a new set of 20 keywords belonging to 4 representative legal categories. Local Grammar Graphs (LGGs), which are effective in describing local linguistic phenomena, were adopted to describe the various linguistic patterns in this domain. These local language patterns, modularized in LGG format, are converted into Finite State Transducers (FSTs), from which the datasets required for training an NLU language model are generated. To evaluate this process, we trained an NLU model of the open-source chatbot architecture Rasa on our dataset. The model achieves an F1-score of 0.91, which indicates that the linguistic resources and the methodology proposed in this study can be practically applied to the development of legal counseling chatbot systems.
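Table of Contents
1. Introduction
2. Related Work
3. Overview of the Research Method
3.1. Collection of Legal Counseling Data
3.2. Classification in Existing Legal Counseling Services
3.3. Proposed Classification Scheme for Building Linguistic Resources in the Legal Counseling Domain
3.4. Analysis of ‘Background’ Expressions in Legal Counseling Queries
3.5. Discourse Expressions in Legal Counseling Queries
4. Building Linguistic Resources for NLU in the Legal Counseling Domain
4.1. Composition of Linguistic Resources by Core Keyword in the Legal Counseling Domain
4.2. Composition of ‘Background’ Expression Resources
4.3. Composition of Discourse Expression Resources
4.4. Integration of the Modular Linguistic Resources
5. Verifying the Usefulness of the Linguistic Resources
5.1. Performance Evaluation
5.2. Implementation Case: LIGA, a Legal Counseling Domain Chatbot
6. Conclusion and Future Work
References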