Korean Word Clustering Technique Using Word2Vec

Jee-Uk Heu (허지욱)

Korean Language Clustering using Word2Vec

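Published by 국제인공지능학회 (formerly 한국인터넷방송통신학회, the Institute of Internet, Broadcasting and Communication). The Journal of the Institute of Internet, Broadcasting and Communication, Vol. 18, No. 5, October 2018, pp. 25-30. KCI-indexed.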

Cited by: 0 (source: Naver Academic)

Abstract

English

With the recent development of Internet technology, research areas such as information retrieval and data extraction have become increasingly important for providing information efficiently and quickly. In particular, techniques for finding and analyzing words that are semantically similar to a given Korean word are needed, because the meaning of compound words and newly coined words is not easy to grasp. Word clustering is one technique that addresses this problem by grouping the words similar to a given word. In this paper, we propose a Korean word clustering technique that embeds the words of the given documents using Word2Vec and clusters the similar words.

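Korean (translated)

With the recent growth of the Internet, research areas such as information retrieval and data extraction, which provide efficient search results so that users can quickly obtain the information they want, are becoming increasingly important. However, newly coined Korean words and buzzwords are hard to interpret, so research is needed on techniques that find and analyze words semantically similar to a given word. Word clustering, one approach to this problem, finds the words in a document that are semantically similar to a given word and groups them together. In this paper, we propose a technique that uses Word2Vec to embed the words of a given Korean document and automatically clusters similar Korean words.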
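
The clustering idea in the abstract can be illustrated with a minimal sketch. It assumes the words have already been embedded; the three-dimensional vectors below are made-up toy values standing in for Word2Vec output, not results from the paper. The greedy threshold grouping and the "highest mean similarity" rule for picking a representative word are illustrative assumptions, since the abstract does not specify which clustering algorithm or selection rule the author uses. The names `cluster_words`, `representative`, and `TOY_VECTORS` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(vectors, threshold=0.8):
    """Greedy single-pass grouping (an assumed strategy, not the paper's).

    Each word joins the first cluster whose seed word (its first member)
    is at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for word, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def representative(cluster, vectors):
    """Pick the member with the highest mean similarity to the others."""
    def mean_similarity(word):
        others = [o for o in cluster if o != word]
        if not others:
            return 1.0
        return sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)

    return max(cluster, key=mean_similarity)


# Toy embeddings (assumed values), standing in for trained Word2Vec vectors.
TOY_VECTORS = {
    "사과": [0.9, 0.1, 0.0],     # apple
    "배": [0.8, 0.2, 0.1],       # pear
    "포도": [0.85, 0.15, 0.05],  # grape
    "버스": [0.1, 0.9, 0.2],     # bus
    "지하철": [0.0, 0.8, 0.3],   # subway
}

clusters = cluster_words(TOY_VECTORS)
representatives = [representative(c, TOY_VECTORS) for c in clusters]
```

With these toy vectors the fruit words and the transport words fall into separate groups. In practice the vectors would come from a Word2Vec model trained on a morphologically analyzed Korean corpus, and the threshold would need tuning for that data.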

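Abstract (Korean)
Abstract (English)
Ⅰ. Introduction
Ⅱ. Related Work
  1. Text Mining and Language Clustering
  2. Word Embedding
Ⅲ. Korean Word Clustering Technique Using Word2Vec
  1. Preprocessing
  2. Word Embedding Using Word2Vec
  3. Clustering
  4. Representative Word Selection
Ⅳ. Experiments and Results
  1. Experimental Data and Method
  2. Experimental Results
Ⅴ. Conclusion
References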

Author Information

Jee-Uk Heu (허지욱). Regular member, Department of Computer Engineering, Hanyang University
