

Exploiting YouTube’s Video ASR Scripts to Extend Educational Videos Textual Representative Tags Based on Gibb’s Sampling Technique



Given the importance of textual information in content retrieval, it is desirable that the textual representations of educational video content on social media platforms such as YouTube capture the semantics of the content they represent. Such coherent textual representations are important for objective video content retrieval, repurposing, reuse, and sense-making. In this study, the Automatic Speech Recognition (ASR) scripts of the video tracks were leveraged to supplement the insufficient content representation provided by the video title alone. The Latent Dirichlet Allocation (LDA) topic modeling approach, implemented with Gibbs sampling, was used to evaluate the suitability of various textual representations for YouTube educational videos and to extract the candidate topic that best extends the original YouTube keywords. The results show that, in topic space, the YouTube ASR script performs better as a representative textual source for the dominant topic than the combined textual representations. The automatic keyword extensions obtained using our method add value to applications that use tags for content discovery or retrieval.


 1. Introduction
 2. Related Work
 3. Method Description
 4. Experimental Result
  4.1 Textual Representations
  4.2 The Algorithm for Extending the YouTube Keywords
 5. Conclusion


  • Ambele Robert Mtafya Central South University, Changsha, Hunan, China; Dar es salaam Institute of Technology, Tanzania
  • Dongjun Huang Central South University, Changsha, Hunan, China
  • Gaudence Uwamahoro Central South University, Changsha, Hunan, China


