Source Information
Abstract
English
Finding research papers on a particular topic of study is the most time-consuming activity for many people, including students, professors, and researchers. People doing research have to search, read, and analyze multiple research papers, e-books, and other documents, then determine what they contain and discover knowledge from them. Much of the available material takes the form of long, unstructured text, which takes a long time to process, search, read, and analyze. Organizing research papers by subject or topic can facilitate the search process. We propose a new method for research paper organization and retrieval that is suited to closely related research papers and intertwined research topics. With our centroid and relationship based clustering approach, research papers are grouped under their most probable research topics or subjects. To determine topic membership, the proposed approach considers relationships such as common terms in paper titles, in keywords, in referenced titles, and in the top frequent sentences. To address the high dimensionality associated with text documents, only the most important information in each paper is considered, and we use multi-word and frequently occurring phrases as features in the clustering process. Our experiments show that the approach is effective.
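As a rough illustration of the topic-membership idea described in the abstract, the following Python sketch scores a paper against candidate topic centroids by the terms they share in each field and assigns the paper to the best-scoring topic. Everything here is an assumption made for illustration: the field names, the whitespace tokenizer, the use of Jaccard overlap, and the equal weighting of fields are not taken from the paper, and single words stand in for the multi-word phrases the authors use as features.

```python
# Hypothetical field names; the paper's actual data layout is not given here.
FIELDS = ("title", "keywords", "referenced_titles", "frequent_sentences")


def tokenize(text):
    """Lowercase a string and split it on whitespace into a set of terms."""
    return set(text.lower().split())


def jaccard(a, b):
    """Share of terms common to both sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relationship_score(paper, centroid):
    """Average per-field term overlap between a paper and a topic centroid.

    Equal field weights are an assumption, not the authors' weighting.
    """
    total = sum(
        jaccard(tokenize(paper.get(f, "")), tokenize(centroid.get(f, "")))
        for f in FIELDS
    )
    return total / len(FIELDS)


def assign_topic(paper, centroids):
    """Return the name of the centroid with the highest relationship score."""
    return max(centroids, key=lambda name: relationship_score(paper, centroids[name]))


if __name__ == "__main__":
    centroids = {
        "clustering": {"title": "document clustering centroid",
                       "keywords": "clustering k-means"},
        "retrieval": {"title": "information retrieval ranking",
                      "keywords": "search ranking"},
    }
    paper = {"title": "centroid based document clustering",
             "keywords": "clustering text"}
    print(assign_topic(paper, centroids))
```

A full implementation along the lines of the abstract would presumably also update the centroids as papers are assigned and would extract frequent phrases rather than single words. This sketch covers only the assignment step.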
Table of Contents
1. Introduction
2. Related works
3. Centroid and Relationship based Clustering Method
3.1. Information extraction
3.2. Paper relationship
3.3. Multi word features
3.4. Centroid based clustering
4. Feature Extraction and Initial Centroids Selection
4.1. Preprocessing
4.2. Important information extraction or Feature selection
4.3. Phrase extraction
4.4. Initial Centroid selection
5. Papers Clustering
6. Evaluation of the Proposed Approach
7. Conclusion
References