The purpose of this paper is to evaluate the abductive reasoning capability of deep learning-based language models, with a specific focus on their ability to infer the meaning of tautological expressions. To this end, we conducted two experiments that required language models to understand the meaning of tautological expressions in different contexts. The first experiment was a binary classification task in which the models were presented with preceding sentences as context along with a tautological expression and were asked to judge whether the tautological sentence was appropriate in that context. We employed three BERT-based models and assessed their capability against the results of human evaluation. In the second experiment, we had ChatGPT generate a coherent sentence following the preceding sentence and the tautological expression, and we categorized the types of errors found in generated sentences judged inappropriate. This study is significant in that it reveals the limitations of contemporary deep learning-based language models with respect to abductive reasoning and proposes a new task for standardized model evaluation.