Article Information
Abstract
English
Over the last decades, advances in information and communication technology and the growing volume of printed documents in many applications have made document image databases increasingly important. Document images are documents that normally originate on paper and are then electronically scanned and stored as images, as part of the move toward a paperless office. Document image retrieval is an important research area in the field of document image databases, and many approaches have been proposed for indexing and retrieving document images. Traditionally, optical character recognition (OCR) has been used to convert the entire document into an electronic version that can be indexed automatically. Keyword spotting was later proposed for indexing document images, and it has a lower cost than OCR. However, both methods have problems indexing document images that contain non-text components. Three approaches have been presented to address this problem: the signature-based approach, the layout-structure-based approach, and the logo-based approach. In this paper, we propose a framework for classifying document image retrieval approaches and then evaluate these approaches against important measures.
Table of Contents
1. Introduction
2. Document Image Retrieval
2.1. Query Image
2.2. Noise Removal
2.3. Feature Extraction
2.4. Matching Algorithm
2.5. Indexed Documents
3. Evaluation Metrics for Document Image Retrieval Performance
4. Proposed Framework for Classifying Document Image Indexing Approaches
4.1. Document Image Retrieval with Optical Character Recognition
4.2. Document Image Retrieval Based on Keyword Spotting
4.3. Document Image Retrieval Based on Layout Structural Similarity
4.4. Signature Based Document Image Retrieval
4.5. Document Image Retrieval Based on Logo Matching
5. Evaluation of Document Image Retrieval Approaches
6. Conclusion
References