Source Information
Recent Trend on Expansion of Dataset in Natural Language Processing
Abstract
English
Deep learning is now widely applied to natural language processing, typically through encoder-decoder models. Neural network-based natural language processing techniques mostly use sequence-to-sequence models trained with supervised learning. The success of such end-to-end deep neural networks depends on securing a large amount of training data, and building a large dataset of input/output pairs takes a lot of time and money. Recent studies therefore look into ways to maintain performance by expanding scarce datasets. In this paper, we describe dataset expansion methods such as denoising training, transfer learning, and pre-training with BERT. We also discuss recent applications of dataset expansion to machine translation systems.
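Table of Contents
I. Introduction
II. Dataset Expansion Methods
2.1 Dataset Expansion through Denoising Unsupervised Learning
2.2 Transfer Learning
2.3 Pre-training
III. Dataset Expansion Cases in Machine Translation
3.1 Domain Separation and Domain Adaptation
3.2 Multilingual Machine Translation Systems
3.3 Training Methods Based on Unsupervised Learning
IV. Conclusion
References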