Development of Block-based Code Generation and Recommendation Model Using Natural Language Processing Model

In-Seong Jeon, Ki-Sang Song
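The Korean Association of Information Education, Journal of The Korean Association of Information Education, Vol. 26, No. 3, June 2022, pp. 197-207. KCI-indexed.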

Abstract

In this paper, we develop a machine-learning-based block code generation and recommendation model that aims to reduce learners' cognitive load during coding education. Using a natural language processing model with transfer learning and fine-tuning, the model learns from the blocks a learner has already assembled in a block programming environment, then generates and recommends blocks the learner can select in the next step. The training data were produced by preprocessing the block code of 50 popular projects on the site of the block programming language ‘Entry’. After splitting the preprocessed blocks into training, validation, and test datasets, we built block code generation models based on LSTM, Seq2Seq, and GPT-2. In the performance evaluation, GPT-2 scored higher than the LSTM and Seq2Seq models on BLEU and ROUGE, which measure sentence similarity. Inspection of the output actually generated by the GPT-2 model showed relatively similar BLEU and ROUGE scores across cases, except when the number of blocks was 1 or 17.
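To make the evaluation concrete, the following is a minimal sketch of how BLEU and ROUGE can score a generated block sequence against a reference sequence when each block is treated as one token. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not say which BLEU n-gram order or which ROUGE variant was used, so the sketch assumes an unsmoothed BLEU over 1-grams and 2-grams and ROUGE-L, and the block names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference, candidate, max_n=2):
    """Simplified sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty. No smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    if len(candidate) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(sum(log_precisions) / max_n)


def rouge_l(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence (LCS)."""
    rows, cols = len(reference), len(candidate)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if reference[i] == candidate[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    lcs = table[rows][cols]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / cols, lcs / rows
    return 2 * precision * recall / (precision + recall)


# Hypothetical block names, not taken from the paper's dataset.
reference = ["when_run_button_click", "repeat_basic", "move_direction", "wait_second"]
generated = ["when_run_button_click", "repeat_basic", "move_direction", "rotate_relative"]

print(f"BLEU-2:  {bleu(reference, generated):.3f}")
print(f"ROUGE-L: {rouge_l(reference, generated):.3f}")
```

Because unsmoothed BLEU drops to zero whenever a higher-order n-gram has no match, very short sequences can score erratically under this kind of metric. That may be related to the outlying result the abstract reports for single-block cases, though the abstract itself does not give a cause.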

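Summary (Korean)
Abstract
1. Introduction
2. Theoretical Background
2.1. Learner Feedback in Programming Education
2.2. Natural Language Processing Technology
3. Research Method
3.1. Dataset Construction
3.2. Data Preprocessing
3.3. Model Development Method
3.4. Model Evaluation
4. Results
4.1. Block Generation Using the GPT-2 Model
4.2. Performance Evaluation of the LSTM, Seq2Seq, and GPT-2 Models
5. Conclusion
References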

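Author Information

In-Seong Jeon. Department of Computer Education, Korea National University of Education
Ki-Sang Song. Department of Computer Education, Korea National University of Education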
