Major Analytical Techniques for Big Data Analysis of Unstructured Data in Public Organizations

As the use of the web and social media grows, the volume of unstructured data generated online is increasing exponentially, and among such unstructured data, text data in particular is being analyzed in a wide range of fields. Defense and public institutions accumulate unstructured data as files in their administrative information systems and in the On-nara system, in formats such as Hangul word processor (HWP) documents, MS-Word documents, and PowerPoint presentations. The aim of this paper is to support big data analysis of these files so that each institution can obtain information suited to its needs and extract material that can be used to forecast future trends. To that end, we study the methods and procedures of the text mining techniques most commonly used in unstructured data analysis, together with the characteristics and analysis procedure of Word2Vec, a deep-learning-based text mining technique.
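To make the two families of techniques named above more concrete, the following is a minimal sketch, not the procedure used in this paper. It assumes that the text has already been extracted from the source files into plain strings, and it uses short English sample sentences; Korean documents would additionally need a morphological analyzer for tokenization. The function names (tokenize, term_frequencies, skipgram_pairs) and the sample documents are hypothetical and chosen only for illustration. The first part counts term frequencies, a basic text mining step, and the second part builds the (center word, context word) pairs on which a skip-gram Word2Vec model is trained.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(documents):
    """Count how often each token occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts


def skipgram_pairs(tokens, window=2):
    """Build (center, context) pairs within a symmetric window.

    These pairs are the training examples of the skip-gram variant
    of Word2Vec; the model itself is not implemented here.
    """
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical sample documents (assumption: already extracted to plain text).
documents = [
    "Budget report for the defense agency",
    "Agency budget plan and defense policy report",
]

frequencies = term_frequencies(documents)
pairs = skipgram_pairs(tokenize(documents[0]), window=1)

print(frequencies.most_common(3))
print(pairs[:4])
```

In practice the pairs would be passed to a Word2Vec implementation that learns a vector for each word, so that words appearing in similar contexts receive similar vectors.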