Article Information
Abstract
English
Discovery and subsequent effective retrieval of useful user-generated content depend on proper metadata annotation of an object, such as a title and keywords. This study proposes a simpler, unsupervised, non-graph-based algorithm for keyword extraction. It adopts a novel key-phrase chunking approach that uses word sequences as they appear in the original document. The simple but effective term frequency-inverse document frequency (tf-idf) weighting scheme is used to rank the newly created key phrases. Compared with a similar algorithm that uses a three-metric weighting scheme, tf-idf yielded a precision of 89%. Thus, applying the tf-idf algorithm to keywords based on YouTube's metadata appears to be a useful approach on account of its objectivity.
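The abstract names the two steps of the approach (chunking word sequences into key phrases, then ranking them by tf-idf) without giving the exact rules. The following is only a minimal sketch under stated assumptions: the stopword-delimited chunking rule, the `STOPWORDS` list, the function names `chunk_phrases` and `rank_phrases`, and the `tf * log(N / df)` form of the weight are all illustrative choices, not the authors' published algorithm.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, used only to delimit candidate phrases.
STOPWORDS = {"a", "an", "the", "of", "and", "or", "to", "in", "on",
             "for", "is", "are", "with", "by"}


def chunk_phrases(text):
    """Return candidate key phrases: maximal runs of non-stopwords,
    kept in the order they appear in the text (assumed chunking rule)."""
    phrases, current = [], []
    for token in re.findall(r"[a-z0-9]+|[.,;:!?]", text.lower()):
        # A stopword or a punctuation mark closes the current phrase.
        if token in STOPWORDS or not token[0].isalnum():
            if current:
                phrases.append(" ".join(current))
                current = []
        else:
            current.append(token)
    if current:
        phrases.append(" ".join(current))
    return phrases


def rank_phrases(documents, index):
    """Rank the phrases of documents[index] by tf-idf over the collection."""
    chunked = [chunk_phrases(doc) for doc in documents]
    n_docs = len(chunked)

    # Document frequency: in how many documents each phrase occurs.
    doc_freq = Counter()
    for phrases in chunked:
        doc_freq.update(set(phrases))

    counts = Counter(chunked[index])
    total = sum(counts.values())
    scores = {
        phrase: (count / total) * math.log(n_docs / doc_freq[phrase])
        for phrase, count in counts.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    docs = [
        "keyword extraction for video metadata and keyword extraction for titles",
        "video metadata of the music video",
        "cooking tips",
    ]
    for phrase, score in rank_phrases(docs, 0):
        print(f"{score:.3f}  {phrase}")
```

In this toy collection, a phrase that recurs in the target text but is rare elsewhere ranks highest, while a phrase shared with other documents is pushed down. That is the behavior the tf-idf weighting is meant to provide.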
Table of Contents
1. Introduction
2. Related Work
2.1. Machine Learning Approaches
2.2. Key Techniques
2.3. TF-IDF for Single Document
3. The Algorithm Description
4. Experimental Result
5. Extension to YouTube Videos
6. Discussion
7. Conclusion
References