Article Information
A Study on the Development of an LLM-Based Multi-Trait Essay Scoring Framework for EFL Learners
Abstract (English)
Essay scoring has garnered increasing attention in the field of EFL education due to its efficacy in assessing learners’ cognitive and communicative competencies. However, manual essay scoring often faces significant bottlenecks, including a lack of scoring expertise, inconsistencies among raters, and substantial time demands. To address these challenges, this study proposes a Large Language Model-based Multi-Trait Essay Scoring (LMES) framework, which incorporates rubric-based criteria into the reasoning process of an LLM to generate textual justifications, referred to as rationales. These rationales subsequently guide a BERT language model in predicting trait-specific scores. Using the Feedback Prize essay dataset, LMES demonstrates the effectiveness of LLM-generated rationales in improving alignment with human raters. Semantic similarity and hit-rate analyses indicate that the rationales closely reflect the rubric criteria, particularly for meaning-focused traits. In addition, including the rationales yields stronger correlations between predicted and human-rated scores, and factor analysis of the predicted scores reveals a clear distinction between form- and meaning-focused traits. These findings suggest that rubric-based rationales enhance the validity and accuracy of automated essay scoring in EFL contexts.
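The abstract describes a two-stage pipeline: rubric criteria are injected into an LLM prompt to elicit a trait-specific rationale, and a BERT model then scores each trait from the essay together with that rationale. The following minimal sketch illustrates this flow under stated assumptions; the rubric wording, prompt template, bert-base-uncased backbone, and single-output regression head are illustrative choices, not the configuration reported in the paper.

# Hypothetical sketch of the pipeline summarized in the abstract (not the authors' code).
# Stage 1: build a rubric-grounded prompt so an LLM can produce a rationale per trait.
# Stage 2: a BERT regressor reads essay + rationale and predicts that trait's score.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative rubric criteria; the study's actual rubric follows the Feedback Prize traits.
RUBRIC = {
    "cohesion": "Ideas are logically connected with appropriate transitions.",
    "conventions": "Spelling, punctuation, and grammar follow standard usage.",
}

def build_rationale_prompt(essay: str, trait: str) -> str:
    """Embed the rubric criterion for one trait into the prompt sent to the LLM."""
    return (
        f"Rubric criterion for '{trait}': {RUBRIC[trait]}\n"
        f"Essay:\n{essay}\n"
        "Explain, with reference to the rubric, how well the essay meets this criterion."
    )

# Assumed BERT-based scorer: a single regression output per trait.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
scorer = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1
)

def predict_trait_score(essay: str, rationale: str) -> float:
    """Encode essay and rationale as a sentence pair and regress a trait score."""
    inputs = tokenizer(essay, rationale, truncation=True,
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        return scorer(**inputs).logits.squeeze().item()

In this sketch the LLM call itself is omitted: build_rationale_prompt could feed any instruction-tuned model, and the returned rationale text is what the BERT scorer consumes alongside the essay.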
Table of Contents
II. Theoretical Background and Prior Research
1. Automated Essay Scoring (AES)
2. Criterion-Referenced Assessment (CRA)
III. Research Methods
1. Design of an LLM-Based Multi-Trait Essay Scoring Framework
2. Essay Data Written by EFL Learners
3. Dataset Splitting and Validation Strategy for Training the LMES Model
4. LMES Model Training Environment and Strategy
5. Analysis Methods
IV. Results and Discussion
1. Content Validity Analysis Using the Similarity Between LLM-Generated Rationales and the Scoring Rubric (Research Question 2)
2. Analysis of LMES Multi-Trait Essay Scoring Results (Research Question 3)
V. Conclusion
Works Cited
Appendix
Abstract
