Article Information
Abstract
English
Query subtopic mining aims to find aspects that represent users' potential intents behind a query. Clustering query reformulations is currently the most common approach to subtopic mining. However, existing approaches face challenges such as term mismatch and data sparseness when trying to find subtopics that are both relevant and diverse. To address these problems, this paper introduces novel semantic representations for query subtopics, including a phrase embedding representation and a query category distributional representation. We also combine multiple semantic representations in a vector space model and compute a joint similarity for clustering query reformulations. To evaluate the approach, we conduct an experiment on a public dataset provided by the NTCIR subtopic mining project. The experimental results show that the phrase embedding representation is the most effective representation, while combining multiple semantic representations benefits short text clustering and improves the performance of query subtopic mining.
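The idea of a joint similarity over several representations can be illustrated with a minimal sketch. The abstract does not state how the similarities are combined, so the weighted sum of cosine similarities below is an assumption, as are the function names, the representation names, the weights, and the toy vectors; it is not the paper's actual formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def joint_similarity(x, y, weights):
    """Weighted average of per-representation cosine similarities.

    x and y map a representation name to a vector. weights maps the same
    names to non-negative weights. Both the weighting scheme and the
    names are hypothetical; the abstract does not specify them.
    """
    total = sum(weights.values())
    return sum(w * cosine(x[name], y[name]) for name, w in weights.items()) / total


# Toy vectors for two query reformulations (made up for illustration only).
reform_a = {"embedding": [1.0, 0.0], "category": [0.5, 0.5]}
reform_b = {"embedding": [0.0, 1.0], "category": [0.5, 0.5]}
weights = {"embedding": 0.5, "category": 0.5}

score = joint_similarity(reform_a, reform_b, weights)
```

In this toy case the two reformulations disagree in the embedding space but share the same category distribution, so the joint score falls between the two individual similarities. A pairwise score of this kind could then be passed to any clustering procedure that accepts a similarity matrix.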
Table of Contents
1. Introduction
2. Related Work
2.1 Query Subtopic Mining
2.2 Short Text Processing
3. Proposed Method
3.1 Framework
3.2 Aspect Phrase Extraction
3.3 Aspect Phrase Representation
3.4 Subtopic Generation
4. Experimental Setup
4.1 Dataset
4.2 Evaluating Metrics
4.3 Query Classification
4.4 Baselines
5. Experiment Results Analysis
5.1 Comparison of Semantic Representations
5.2 Comparison of Semantic Composition Methods
5.3 Combination of Semantic Representations
6. Conclusions and Future Work
References