On Objective Keywords Extraction: Tf-Idf based Forward Words Pruning Algorithm for Keywords Extraction on YouTube

Abstract

Discovery and subsequent effective retrieval of useful user-generated content depend on proper metadata annotation of an object, such as a title and keywords. In this study, a simpler, unsupervised, non-graph-based algorithm for extracting keywords is proposed. A novel key-phrase chunking approach is adopted, which uses word sequences as they appear in the original document. The simple but effective term frequency-inverse document frequency (tf-idf) weighting scheme is used to rank the newly created key phrases. Compared with a similar algorithm that uses a three-metric weighting scheme, the tf-idf approach yielded a precision of 89%. Thus, applying the tf-idf algorithm to keywords derived from YouTube metadata proves to be a useful approach in terms of objectivity.
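The general idea of ranking in-order word sequences by tf-idf can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' forward words pruning algorithm, whose details are not given in the abstract. The stopword list, the helper names `candidate_phrases` and `rank_by_tfidf`, the maximum phrase length, and the smoothed idf formula are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "to", "and", "is", "for", "with"}


def candidate_phrases(text, max_len=3):
    """Emit contiguous word sequences (up to max_len words) in document order.

    Runs of words are broken at punctuation and at stopwords, so every
    candidate is a sequence that appears verbatim in the text.
    """
    phrases = []
    for segment in re.split(r"[^\w\s]+", text.lower()):
        run = []
        for word in segment.split() + [None]:  # None flushes the last run
            if word is None or word in STOPWORDS:
                for n in range(1, max_len + 1):
                    for i in range(len(run) - n + 1):
                        phrases.append(" ".join(run[i:i + n]))
                run = []
            else:
                run.append(word)
    return phrases


def rank_by_tfidf(doc, corpus, top_k=5):
    """Rank the candidate phrases of doc by tf-idf against corpus."""
    tf = Counter(candidate_phrases(doc))
    total = sum(tf.values())
    doc_phrase_sets = [set(candidate_phrases(d)) for d in corpus]
    n_docs = len(corpus)
    scores = {}
    for phrase, count in tf.items():
        df = sum(1 for s in doc_phrase_sets if phrase in s)
        # Smoothed idf (an assumption); avoids division by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[phrase] = (count / total) * idf
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    # Made-up metadata strings standing in for video titles.
    corpus = [
        "cooking video tutorial",
        "music video live",
        "keyword extraction video",
    ]
    for phrase, score in rank_by_tfidf(corpus[2], corpus, top_k=6):
        print(f"{score:.3f}  {phrase}")
```

In this toy corpus the word "video" occurs in every document, so it receives the lowest idf and falls to the bottom of the ranking, while phrases unique to the target document rise to the top.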

Table of Contents

Abstract
 1. Introduction
 2. Related Work
  2.1. Machine Learning Approaches
  2.2. Key Techniques
  2.3. TF-IDF for Single Document
 3. The Algorithm Description
 4. Experimental Result
 5. Extension to YouTube Videos
 6. Discussion
 7. Conclusion
 References

Author Information

  • Ambele Robert Mtafya, Central South University, Changsha, Hunan, China; Dar es Salaam Institute of Technology, Tanzania.
  • Dongjun Huang, Central South University, Changsha, Hunan, China; Dar es Salaam Institute of Technology, Tanzania.
  • Gaudence Uwamahoro, Central South University, Changsha, Hunan, China; Dar es Salaam Institute of Technology, Tanzania.
