

An Analysis of the Errors in the Auto-Generated Captions of University Commencement Speeches on YouTube


Jeong-Hwa Lee, Kyung-Whan Cha




Auto-generated captions on YouTube have proven useful in helping viewers follow the words being spoken. At times, however, the captions are inaccurate and lead to confusion. The aim of this paper is to identify and analyze errors in the auto-generated captions of 20 commencement speeches on YouTube. The speeches were delivered over a period of 12 years by speakers from different walks of life; the researchers selected ten male and ten female icons. Only the first 10 minutes of each speech were used for this investigation. All caption errors were collected and analyzed. The analysis showed that the number of errors per speech ranged from 10 to 46, with an average of one error about every 26 seconds. Among the error categories, nouns recorded the highest number with 144 cases (31.3%), followed by verbs with 93 cases (20.2%) and prepositions with 37 cases (8.1%). Among the four subcategories, namely omission, addition, substitution, and word order, substitution recorded the highest number of errors with 357 cases (77.6%). Furthermore, the errors were classified into two major groups: those involving function words, which appeared in 169 cases (36.7%), and those involving content words, which appeared in 291 cases (63.3%). The results of this research suggest that continued development of the voice recognition software that automatically generates captions is necessary for more efficient and accurate captioning that will help viewers and listeners better comprehend video content.
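The arithmetic behind some of these figures can be checked with a short sketch. It is an illustration only, not the authors' procedure: it assumes that the function-word and content-word groups together account for every error, and that each of the 20 speeches contributes exactly 10 minutes. The names `share`, `total_errors`, and `seconds_per_error` are hypothetical.

```python
# Figures taken from the abstract.
SPEECHES = 20
MINUTES_PER_SPEECH = 10
FUNCTION_WORD_ERRORS = 169
CONTENT_WORD_ERRORS = 291

# Assumption: the two word-class groups together cover every error.
total_errors = FUNCTION_WORD_ERRORS + CONTENT_WORD_ERRORS


def share(count, total=total_errors):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Assumption: exactly 10 minutes of every speech was analyzed.
total_seconds = SPEECHES * MINUTES_PER_SPEECH * 60
seconds_per_error = total_seconds / total_errors

print("total errors:", total_errors)
print("seconds per error:", round(seconds_per_error))
print("noun share (%):", share(144))
print("verb share (%):", share(93))
print("substitution share (%):", share(357))
```

Under these assumptions the total comes to 460 errors over 12,000 seconds of speech, which agrees with the reported average of about one error every 26 seconds and with the reported shares for nouns, verbs, and substitutions.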


Literature Review
Auto-generated Caption Errors
Machine Translation Errors
Data Collection
Data Analysis
Auto-generated Caption Errors Based on 10 Categories and Four Sub-Categories
Function Word and Content Word Errors
Frequency Rates of Auto-generated Caption Errors as Recorded from the 20 Commencement Speeches
Discussion and Implication
Relating to the 10 Categories and Four Sub-Categories
Relating to Function Words and Content Words
Relating to the Frequency Rates of Each of the 20 Commencement Speeches
Summary and Limitations
The Authors
Appendix A
Appendix B


  • Jeong-Hwa Lee, Hansung University, Korea
  • Kyung-Whan Cha, Chung-Ang University, Korea

