Source Information
Automatic Evaluation and Translation Techniques in Korean–Japanese AI Translation - A Comparative Study with Human Translation -
Abstract
English
This study compares the quality of human translation and AI translation, specifically neural machine translation (NMT) and large language model (LLM)-based translation, in the context of Korean–Japanese translation. The source texts comprised informative texts (news articles) and expressive texts (columns), while the target texts included professional human translations as well as AI translations generated by Google, Papago, DeepL, and ChatGPT in 2023 and 2025. Translation quality was assessed using BLEU and TER metrics, computed with SacreBLEU to ensure reproducibility, and complemented by a sentence-level qualitative analysis based on Molina & Hurtado Albir’s (2002) translation techniques. The results revealed that DeepL achieved the highest BLEU and TER scores, whereas ChatGPT employed a broader range of techniques and produced translations most comparable to human translations. These findings highlight the limitations of automatic metrics and demonstrate the importance of combining quantitative metrics with qualitative and human evaluation to achieve a more comprehensive understanding of AI translation performance.
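Table of Contents
2. Previous Research
2.1 Machine Translation
2.2 Automatic Evaluation
2.3 Translation Techniques
3. Analysis
3.1 Overview of the Data
3.2 Results of the Automatic Evaluation
3.3 Analysis of Translation Techniques
4. Conclusion
[References]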
