Document Information
Abstract
English
The large number of emerging information networks brings new challenges to community detection. A meaningful community should be topic-oriented. However, topology-based methods reflect only the strength of connections and ignore the consistency of topics, while content-based methods focus on the content and completely ignore the links. This paper explores simLPA, a topic-oriented community detection method based on label propagation for information networks. The method uses the Latent Dirichlet Allocation (LDA) topic model to represent node content and calculates content similarity with the normalized Kullback–Leibler divergence. Building on LabelRank, simLPA fuses links and content naturally to detect topic communities. Extensive experiments on nine real-world datasets of varying sizes and characteristics show that the proposed method outperforms the baseline algorithms in quality. Additionally, even with content integrated, simLPA is comparable to LabelRank in efficiency, so it can readily handle large-scale information networks.
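The content-similarity step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each node already has a topic distribution (for example, from an LDA model). The abstract does not specify the normalization, so the symmetrised divergence and the 1 / (1 + d) mapping below are assumptions chosen for illustration, not necessarily the paper's formula. The function names and the sample distributions are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0) when q has zero-probability topics.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def content_similarity(p, q):
    """Map a symmetrised KL divergence into (0, 1]; 1 means identical topics.

    Assumption: the 1 / (1 + d) normalisation is illustrative only and is
    not taken from the paper.
    """
    d = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + d)


# Hypothetical per-node topic distributions (e.g. LDA output over 3 topics).
theta_a = [0.7, 0.2, 0.1]
theta_b = [0.6, 0.3, 0.1]
theta_c = [0.1, 0.1, 0.8]

sim_ab = content_similarity(theta_a, theta_b)  # similar topic mix
sim_ac = content_similarity(theta_a, theta_c)  # dissimilar topic mix
print(f"sim(a, b) = {sim_ab:.3f}, sim(a, c) = {sim_ac:.3f}")
```

In a label propagation setting, a score of this kind could weight each edge so that labels spread more readily between neighbours with similar topics. How simLPA actually combines it with the link structure is described in Section 3.2 of the paper.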
Table of Contents
1. Introduction
2. Related Works
3. Methodology
3.1 The Content Representation
3.2 simLPA Algorithm
3.3 Performance Metrics
4. Experiments
4.1 Datasets
4.2 Effect of Inflation Factor
4.3 The Impact of the Different Label Propagation Strategy
4.4 Running Time
4.5 Community Detection Results
5. Conclusions
References
