Article Information
Abstract
English
This study examines the quality of machine translation post-editing (MTPE) for Korean-Chinese legislative texts and proposes an evaluation approach suited to legal translation. A total of 109 sentences from the Enforcement Decree of the Framework Act on Administrative Regulation were analyzed by comparing GPT-4-generated translations with MTPE outputs produced by native Chinese-speaking doctoral students in translation studies. The study investigates the correlations between expert-based human evaluation and automatic metrics such as BLEU, METEOR, BERTScore, COMET, and HTER to assess whether these metrics adequately reflect improvements in MTPE quality. The results show that the automatic metrics correlate only moderately with human evaluation, with METEOR exhibiting the highest correlation. BLEU revealed structural limitations in capturing semantic accuracy in legislative texts. In addition, HTER correlated negatively with both human and automatic evaluation, indicating that greater editing effort does not necessarily yield higher translation quality. These findings suggest that the structural characteristics of legislative texts and the post-editors' limited legal expertise influence evaluation outcomes. The study concludes that a multifaceted evaluation framework combining automatic and human assessment is needed to ensure reliable MTPE quality evaluation for Korean-Chinese legislative translation. The results provide empirical evidence for refining assessment models tailored to the genre-specific features of legislative texts.
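As a rough illustration of the correlation analysis described in the abstract, the sketch below shows how per-sentence automatic metric scores could be correlated with human ratings using SciPy. The metric names are those mentioned above, but all score values, sample sizes, and variable names here are hypothetical placeholders, not the paper's data; in practice the metric scores would come from tools such as sacrebleu, bert-score, or COMET.

```python
# Minimal sketch (not the paper's code): correlating automatic metric scores
# with human evaluation scores at the sentence level.
from scipy.stats import pearsonr, spearmanr

# Hypothetical per-sentence scores for a handful of MTPE outputs.
human_scores = [4.5, 3.0, 4.0, 2.5, 5.0]          # expert ratings (placeholder)
metric_scores = {
    "BLEU":   [0.42, 0.31, 0.38, 0.22, 0.55],      # placeholder values
    "METEOR": [0.61, 0.44, 0.58, 0.35, 0.70],
    "HTER":   [0.18, 0.40, 0.22, 0.47, 0.10],      # higher = more editing
}

for name, scores in metric_scores.items():
    r, p = pearsonr(scores, human_scores)          # linear correlation
    rho, _ = spearmanr(scores, human_scores)       # rank correlation
    print(f"{name}: Pearson r = {r:.3f} (p = {p:.3f}), Spearman rho = {rho:.3f}")
```

With this kind of setup, a negative correlation for an edit-distance metric such as HTER would mirror the abstract's observation that more editing effort does not necessarily correspond to higher human-rated quality.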
Table of Contents
I. Introduction
II. Theoretical Background
1. Automatic Evaluation Metrics
2. Human Evaluation Models
III. Research Methods
IV. Analysis and Discussion
1. Analysis of Automatic Evaluation Results
2. Analysis of Human Evaluation Results
3. Statistical Analysis
V. Conclusion
References
