Building a Sentiment Analysis Model on E-commerce Data Using Natural Language Processing Models

Jun-Young Choi; Heui-Seok Lim

E-commerce data based Sentiment Analysis Model Implementation using Natural Language Processing Model

Korea Convergence Society, Journal of the Korea Convergence Society, Vol. 11, No. 11, 2020.11, pp. 33-39 (KCI indexed)

Abstract

English

In natural language processing, research on tasks such as translation, POS tagging, question answering, and sentiment analysis is being actively carried out worldwide. In sentiment analysis, pretrained sentence embedding models achieve high classification performance on English single-domain datasets. In this paper, six neural network models, BOW (Bag of Words), LSTM[1], Attention, CNN[2], ELMo[3], and BERT (KoBERT)[4], are built and their classification performance is compared on a Korean e-commerce review dataset with diverse domain attributes. The results confirm that pretrained sentence embedding models outperform word embedding models. In addition, a practical neural network model composition is proposed after comparing classification performance across the dataset's 17 categories. Finally, compressing the sentence embedding model is identified as future work, considering the trade-off between model size and inference time in a real-time service.

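Korean (translated)

In natural language processing, research is actively under way in areas such as translation, morphological (POS) tagging, question answering, and sentiment analysis. In sentiment analysis, transfer learning from pretrained models has achieved high classification accuracy on single-domain English datasets. This study uses Korean e-commerce product review data with diverse domain attributes and implements a word-frequency-based BOW (Bag of Words) model along with LSTM[1], Attention, CNN[2], ELMo[3], and KoBERT[4] models to compare their classification performance. We confirm that transfer-learning models, which embed a word differently depending on its context, achieve higher accuracy than models that always embed the same word identically. By analyzing model performance across 17 categories, we propose a sentiment analysis model composition that can be applied in the e-commerce industry. We also compare inference speed against model size and suggest research directions toward models capable of real-time service.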
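
The per-category comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the two category names, and the toy labels and predictions below are all hypothetical, chosen only to show how accuracy might be broken down by product category and compared across models.

```python
from collections import defaultdict


def per_category_accuracy(records):
    """Compute accuracy per category.

    records: iterable of (category, gold_label, predicted_label) tuples.
    Returns {category: accuracy}.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, gold, pred in records:
        total[category] += 1
        if gold == pred:
            correct[category] += 1
    return {c: correct[c] / total[c] for c in total}


def best_model_per_category(results):
    """Pick the highest-accuracy model for each category.

    results: {model_name: {category: accuracy}}.
    Ties keep the model that appears first in `results`.
    """
    best = {}
    for model, by_category in results.items():
        for category, acc in by_category.items():
            if category not in best or acc > results[best[category]][category]:
                best[category] = model
    return best


# Hypothetical toy data: (category, gold sentiment label), 1 = positive.
gold = [("fashion", 1), ("fashion", 0), ("food", 1), ("food", 1)]

# Hypothetical predictions from two models (made up, not results from the paper).
preds = {
    "bow": [1, 1, 1, 0],
    "kobert": [1, 0, 0, 1],
}

results = {
    name: per_category_accuracy(
        (cat, y, p) for (cat, y), p in zip(gold, model_preds)
    )
    for name, model_preds in preds.items()
}
best = best_model_per_category(results)

for name, by_category in results.items():
    print(name, by_category)
print("best per category:", best)
```

Breaking accuracy down this way makes it visible when a smaller model is competitive in some categories, which is one way such a comparison could inform a practical model composition.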

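Summary
Abstract
1. Introduction
2. Related Work
3. Sentiment Analysis Models
3.1 BOW (Bag of Words) Model
3.2 BDLSTM (Bidirectional Long Short-Term Memory) Model
3.3 BDLSTM (Bidirectional Long Short-Term Memory) + Attention Model
3.4 CNN (Convolutional Neural Network) Model
3.5 ELMo (Embeddings from Language Models) Model
3.6 KoBERT (Korean Bidirectional Encoder Representations from Transformers) Model
4. Model Implementation
4.1 Dataset Description
4.2 Morphological Analyzer
4.3 System Environment for Model Experiments
4.4 Composition of the Implemented Sentiment Analysis Models
5. Model Performance Results
6. Conclusion and Future Work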
REFERENCES

Author Information

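Jun-Young Choi. Master's student, Graduate School of Computer & Information Technology, Korea University
Heui-Seok Lim. Professor, Department of Computer Science, Korea University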
