Source Information
Abstract
English
The law is the most powerful instrument for safeguarding national stability, supporting development in every sector, and protecting the rights and interests of the public, so the fair and accurate application of legal provisions is crucial. As the legal industry becomes more informatized, more and more legal services are delivered in digital form, and the resulting business data contains a large amount of unstructured information. Rapidly extracting valuable content from this mass of unstructured data and putting it to use is a problem the legal industry must solve in its information management. Based on an analysis of the problems in existing laws-and-regulations information systems, this paper proposes a legal affairs information service platform built on cloud computing, UIMA, semantics, big data and Chinese word segmentation. It also proposes a four-layer technical framework based on the design of four methods: integration and management of heterogeneous and homogeneous unstructured data, analysis and processing of unstructured data, semantics-based retrieval of unstructured information, and construction and maintenance of an ontology library. The paper describes in detail how big data and cloud computing are combined and applied in the platform through the design of UFS (a distributed file system), MapReduce (a batch processing technology) and BigTable (a distributed database). Data acquisition and extended data analysis are carried out with the extensible UIMA framework, and data content and analysis results are indexed sequentially using Lucene indexing technology.
For information retrieval, the concept of ontology is introduced into the traditional search model, and a new search model based on domain ontology is proposed. IKAnalyzer 3.x is proposed for Chinese word segmentation. With this information service platform, legal affairs enterprises can effectively integrate structured and unstructured information resources and implement storage, analysis, retrieval and decision-making applications for business data content.
Table of Contents
1. Introduction
2. Problems of Existing Laws and Regulations Retrieval System Introduction
3. Current Research at Home and Abroad
4. System Design
5. Application of Big Data Technology in the Information Service Platform
6. UIMA and its Application
7. Word Segmentation and its Application
8. Conclusion
Acknowledgement
References