



Random Blocking Text Retrieval Algorithm Based on Latent Semantic Analysis


Zhao Yahui (赵亚慧), Jin Xiaofeng (金小峰), Cui Rongyi (崔荣一)



This paper proposes a fast text retrieval algorithm for massive-content text that applies the idea of random blocking on top of Latent Semantic Analysis (LSA). First, to take the correlation between terms fully into account, the query text and the massive-content text are represented in a lower-dimensional space, and the model is refined using singular value decomposition (SVD). Second, a random blocking query method retrieves paragraphs, using the cosine similarity between the query text and the massive-content text as the fitness function; paragraphs whose similarity values exceed a threshold are output as candidates. Experiments show that the proposed method achieves high retrieval performance by making full use of semantic information, and that it retrieves text quickly.
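
The two steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how blocks are drawn, so the sketch assumes a block is a randomly placed contiguous window of paragraphs. The function names (`fit_lsa`, `project`, `random_block_search`), the parameters `block_size`, `n_blocks` and `threshold`, and the toy term-by-paragraph matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_lsa(term_para, k):
    """Return the top-k left singular vectors of a terms x paragraphs matrix."""
    U, s, Vt = np.linalg.svd(term_para, full_matrices=False)
    return U[:, :k]

def project(term_vectors, U_k):
    """Map term-count vectors (one per row, or a single vector) into the latent space."""
    return term_vectors @ U_k

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def random_block_search(para_vecs, query_vec, block_size, n_blocks, threshold, rng):
    """Score paragraphs inside randomly placed blocks; keep those above threshold.

    Assumption: a block is a contiguous window of `block_size` paragraphs whose
    start position is drawn uniformly at random.
    """
    n = len(para_vecs)
    hits = {}
    for _ in range(n_blocks):
        start = int(rng.integers(0, max(1, n - block_size + 1)))
        for i in range(start, min(n, start + block_size)):
            if i in hits:
                continue  # already accepted from an earlier block
            sim = cosine(para_vecs[i], query_vec)
            if sim > threshold:
                hits[i] = sim
    # Candidates as (paragraph index, similarity), most similar first.
    return sorted(hits.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (invented). Rows are terms: text, retrieval, semantic, apple, banana.
# Columns are paragraphs 0..5.
term_para = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 2, 0, 0],
    [0, 0, 1, 0, 0, 1],
], dtype=float)
query = np.array([0, 1, 0, 0, 0], dtype=float)  # the single term "retrieval"

U_k = fit_lsa(term_para, k=2)
para_vecs = project(term_para.T, U_k)  # one latent vector per paragraph
query_vec = project(query, U_k)        # query folded into the same space

candidates = random_block_search(
    para_vecs, query_vec, block_size=2, n_blocks=20,
    threshold=0.8, rng=np.random.default_rng(0),
)
print(candidates)
```

In this toy setup, paragraph 1 contains "text" and "semantic" but not the query term "retrieval"; because the three terms co-occur across paragraphs, the latent space still places it close to the query, which is the kind of term correlation the abstract refers to. Only paragraphs that fall inside a sampled block are scored, so the number and size of blocks trade retrieval speed against coverage.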




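 0. Introduction
 1. Design Scheme
 2. Implementation of Key Techniques
  2.1 Word Segmentation
  2.2 Text Representation
  2.3 Latent Semantic Indexing (LSI) and Singular Value Decomposition (SVD)
 3. Random Blocking Text Retrieval Algorithm Based on Latent Semantic Analysis
 4. Experimental Results and Analysis
  4.1 Experimental Procedure
  4.2 Evaluation Metrics
  4.3 Results and Analysis
 5. Conclusion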


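  • Zhao Yahui (赵亚慧, 조아혜). Intelligent Information Processing Laboratory, Department of Computer Science and Technology, College of Engineering, Yanbian University, Yanji 133002, China
  • Jin Xiaofeng (金小峰, 김소봉). Intelligent Information Processing Laboratory, Department of Computer Science and Technology, College of Engineering, Yanbian University, Yanji 133002, China
  • Cui Rongyi (崔荣一, 최영일). Intelligent Information Processing Laboratory, Department of Computer Science and Technology, College of Engineering, Yanbian University, Yanji 133002, China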


