Source Information
Abstract
English
Finding and obtaining information efficiently from the Web is one of the important elements in realizing a Smart Home environment. Users expect to find the most relevant information within the shortest possible time. In this paper, we investigate the similarity of Web pages within Strongly Connected Components (SCCs). SCCs are overlapping groups of Web pages that may imply a relationship between the Web pages of the same component. Therefore, we seek to trace the similarity of these groups of Web pages using Cosine Similarity. Our experiment, performed on Malaysian Web pages, indicates that Web pages within the same SCC carry a common topic or theme. This finding proves that we may locate Web pages with similar topics using the hyperlink structure, without performing expensive analysis on the contents of the Web pages.
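The approach summarized in the abstract (group pages by SCC, then compare the pages within each group using Cosine Similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy link graph, the sample page texts, and the choice of raw whitespace-split term counts as page vectors are all assumptions made for illustration, and Kosaraju's algorithm is used here only as one standard way to find SCCs.

```python
import math
from collections import Counter
from itertools import combinations


def strongly_connected_components(graph):
    """Return the SCCs of a directed graph {node: [successors]} (Kosaraju, iterative)."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Pass 1: record nodes in order of DFS completion on the original graph.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)  # all successors explored
                stack.pop()

    # Build the reversed graph (every hyperlink flipped).
    reverse = {n: [] for n in nodes}
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)

    # Pass 2: sweep the reversed graph in decreasing completion order.
    assigned, components = set(), []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between term-count vectors (assumed representation)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def average_similarity(component, pages):
    """Mean pairwise cosine similarity in one SCC; None for single-page SCCs."""
    pairs = list(combinations(sorted(component), 2))
    if not pairs:
        return None
    return sum(cosine_similarity(pages[x], pages[y]) for x, y in pairs) / len(pairs)


# Hypothetical toy data: pages "a" and "b" link to each other, "c" only links out.
links = {"a": ["b"], "b": ["a"], "c": ["a"]}
pages = {
    "a": "smart home web search",
    "b": "smart home web pages",
    "c": "football results",
}

for scc in strongly_connected_components(links):
    print(sorted(scc), average_similarity(scc, pages))
```

In this toy graph, pages "a" and "b" form one SCC because each is reachable from the other, while "c" stands alone. A real experiment would likely use weighted term vectors (for example TF-IDF) and a crawled link graph rather than the hand-written data shown here.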
Table of Contents
1: Introduction
2: Related Work
3: Methodology
4: Results
4.1: The Dataset
4.2: Distribution of SCC Sizes
4.3: Distribution of SCC Sizes
4.4: Average Cosine Similarity Score
5: Conclusion and Future Work
References
