A Time-Enhanced Topic Clustering Approach for News Web Search

Abstract

Time is an important dimension of the information space. It plays an important role in Web search, because most Web pages contain time information and many Web queries are time-related. Exploiting the temporal information in Web pages has therefore become an active topic in Web search research. In this paper, we focus on time-enhanced topic clustering for news search results. Traditional clustering algorithms are usually based on the common phrases of Web pages and make little use of the pages' temporal information. To address this, we propose a time-enhanced topic clustering algorithm for news Web pages. It extends traditional algorithms, which consider only textual clustering, by applying a temporal clustering procedure to the topics returned by a textual clustering algorithm: every Web page in a cluster is arranged along a timeline according to the update time recorded in the page. We conduct experiments on a real dataset crawled from Google News and compare our algorithm with K-Means, STC, TFIC, and Minhash Clustering on metrics such as precision and recall. The experimental results show that the proposed algorithm performs better in both offline and online clustering tests.
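
The two-stage idea described in the abstract (textual topics first, then a timeline inside each topic) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Page` type, the `build_timelines` function, and the choice to bucket pages by calendar day are all assumptions made for illustration, and the topic labels are assumed to come from some earlier textual clustering step.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from itertools import groupby


@dataclass
class Page:
    url: str
    topic: str          # label from a textual clustering step (assumed given)
    updated: datetime   # update time extracted from the page (assumed given)


def build_timelines(pages):
    """Group pages by topic, then order each topic's pages along a timeline."""
    by_topic = defaultdict(list)
    for page in pages:
        by_topic[page.topic].append(page)

    timelines = {}
    for topic, members in by_topic.items():
        # Arrange the pages of one topic in chronological order.
        members.sort(key=lambda p: p.updated)
        # Bucket by calendar day; this granularity is an illustrative choice.
        timelines[topic] = [
            (day, [p.url for p in group])
            for day, group in groupby(members, key=lambda p: p.updated.date())
        ]
    return timelines
```

Calling `build_timelines` on a list of `Page` objects returns, for each topic, a chronologically ordered list of `(date, urls)` pairs. A real system would likely need a more careful notion of temporal grouping than fixed calendar days.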

Table of Contents

Abstract
 1. Introduction
 2. Related Work
 3. Time-Enhanced Topic Clustering
  3.1 Offline Clustering
  3.2 Online Clustering
  3.3 Time-Based Clustering
 4. Performance Evaluation
  4.1 Dataset
  4.2 Results
 5. Conclusions
 Acknowledgements
 References

Author Information

  • Jie Zhao (School of Business, Anhui University; School of Management, University of Science and Technology of China)
  • Xiaowen Li (School of Computer Science and Technology, University of Science and Technology of China)
  • Peiquan Jin (School of Computer Science and Technology, University of Science and Technology of China)
