Source Information
Text-mining Based Graph Model for Keyword Extraction from Patent Documents
Abstract
English
Growing interest in patents has led many individuals and companies to file patent applications in a wide range of fields. Filed applications are stored as electronic documents, and searching and categorizing these documents are major problems in data mining. Keyword extraction, which retrieves the keywords that best represent a document, is especially important. Most keyword extraction techniques are based on the vector space model. This model relies solely on term frequency: it assigns each term a weight based on its frequency in the documents and selects keywords in order of weight. As a result, it cannot reflect the relationships between keywords. This paper proposes an improved method that overcomes this limitation in order to extract more representative keywords. The proposed model first prepares a candidate keyword set using the vector space model, then builds a graph representing the relationship between each pair of candidate keywords in the set, and finally selects keywords based on this relationship graph.
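The three stages named in the abstract (candidate set from the vector space model, a graph over candidate pairs, selection from that graph) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual weighting, relationship measure, or selection rule, so everything concrete here is an assumption: plain TF-IDF for the candidates, sentence-level co-occurrence counts as edge weights, and ranking by summed edge weight as the selection step. The function names and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_keywords(docs, target_index, top_n):
    """Stage 1 (assumed TF-IDF): rank the target document's terms by weight."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    term_freq = Counter(tokenized[target_index])
    n_docs = len(docs)
    scores = {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in term_freq.items()}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


def relation_graph(text, candidates):
    """Stage 2 (assumed measure): edge weight = sentences containing both terms."""
    wanted = set(candidates)
    edges = Counter()
    for sentence in re.split(r"[.!?]+", text):
        present = sorted(wanted & set(tokenize(sentence)))
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def select_keywords(edges, candidates, k):
    """Stage 3 (assumed rule): keep the k candidates with the largest edge-weight sum."""
    strength = {c: 0 for c in candidates}
    for (a, b), weight in edges.items():
        strength[a] += weight
        strength[b] += weight
    return sorted(candidates, key=lambda c: (-strength[c], c))[:k]


# Hypothetical two-document corpus; the first document is the one being indexed.
docs = [
    "The battery cell stores energy. The battery cell powers the motor. "
    "The motor drives the wheel.",
    "The engine burns fuel. The fuel pump feeds the engine.",
]
candidates = candidate_keywords(docs, target_index=0, top_n=3)
edges = relation_graph(docs[0], candidates)
keywords = select_keywords(edges, candidates, k=2)
print(candidates, dict(edges), keywords)
```

In this sketch, a term that is frequent but rarely appears alongside other candidates drops in the final ranking, which is the kind of relationship information a frequency-only ranking cannot capture.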
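Table of Contents
2. Related Work
2.1 Text Mining
2.2 Vector Space Model
2.3 Graph-Based Model
3. Relationship Graph Model
3.1 Extraction of the Candidate Keyword Set
3.2 Extraction of In-Sentence Position Information for Candidate Keywords by Section
3.3 Relationship-Based Adjacency Matrix
3.4 Keyword Extraction by Edge Removal
4. Experiments and Evaluation
5. Conclusion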
6. References