

Text Recognition Algorithm Based on Text Features



Text watermarking algorithms based on natural language are difficult to implement, and format-based text watermarking algorithms have poor robustness against format attacks. This paper presents a new text recognition algorithm based on text features. Words are segmented and extracted according to the text features. The feature dimensions are reduced using latent semantic analysis (LSA) and a stop-word database. A new similarity measure is also defined and used to determine the threshold for detecting the watermark. The experimental results indicate that the proposed algorithm has better operating efficiency and stronger robustness than previous work. The algorithm can also effectively handle text documents written in both Chinese and English.
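
The pipeline summarized above (feature extraction, stop-word filtering, similarity against a threshold) can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the function names, and the threshold value of 0.8 are hypothetical, plain cosine similarity stands in for the paper's own similarity measure (which is not given in this summary), and the LSA step is omitted. The tokenizer handles English only; Chinese text would need a dedicated word segmenter.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the paper's stop-word database is not shown here.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "on"}


def extract_features(text):
    """Lowercase the text, split it into word tokens, drop stop words, count terms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors (stand-in measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_match(original, suspect, threshold=0.8):
    """Report a match when similarity reaches the (assumed) threshold."""
    score = cosine_similarity(extract_features(original), extract_features(suspect))
    return score >= threshold
```

Because the features depend only on the words, changes to spacing, line breaks, or letter case leave the similarity score unchanged in this sketch, which is the intuition behind robustness to format attacks.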


 1. Introduction
 2. Relevant Knowledge
 3. Identification Algorithm for Text Feature
  3.1 Compute of Similarity
  3.2 Extraction and Recognition of Text Feature
 4. Experimental Results and the Analysis of the Performance
  4.1 Ascertain the Threshold
  4.2 Attack Experiments
 5. Conclusions


  • De Li Department of Computer Science, Yanbian University 133002, Yanji, China
  • XueZhe Jin Department of Computer Science, Yanbian University 133002, Yanji, China
  • LiHua Cui College of Economics and Management, Yanbian University 133002, Yanji, China


