Application of a Mining Algorithm to Finding Frequent Patterns in a Text Corpus: A Case Study of the Arabic

Abstract

Information repositories containing text in many languages are abundant on the World Wide Web. Digital corpora of the Quran, the sacred text of Islam, in Arabic are also publicly available. The availability of these corpora, and of intelligent applications to analyze them, is vital to a better understanding of the religious text of Islam. In this paper I propose a method of representing the Quranic text corpus as a graph and apply a frequent sub-path mining algorithm to it to generate frequent patterns. I explain how the resulting frequent patterns can be used for subject indexing and for clustering similar verses of the Quran.

Table of Contents

Abstract
 1. Introduction
  1.1. Problem Statement
  1.2. Proposed Solution
  1.3. Significance of Mining Frequent Patterns
  1.4. Related Work
 2. AFS Algorithm
  2.1. Frequent 0-Subpaths
  2.2. Candidate Generation
  2.3. Support Checking
 3. Experimental Results
  3.1. Grid Graph
  3.2. Complete Graph
 4. AFS Application to Quranic Arabic Text
 5. Results
  5.1. Subject Index
  5.2. Verse Similarity
 6. Conclusion and Future Work
 References

Author Information

  • Imran Ali, Computer Science & Information Management Program, Asian Institute of Technology
