A Comparative Analysis of Algorithms for the Efficient Clustering of Big Data

이순근; 임영문

Source information

Comparing and Analyzing on Algorithms for the Effective Clustering of Big Data

Korea Safety Management & Science (대한안전경영과학회), Conference Proceedings of the Korea Safety Management & Science, 2014 Spring Conference, April 2014, pp. 495-500

Abstract

English

As the Internet has spread widely and its technology has advanced, computer use has become routine and almost all data are now stored on computers. Accordingly, many companies and researchers have tried to find relationships within this enormous volume of data. One approach is to use a clustering algorithm, which finds groups of similar data within the entire data set and discovers their common properties. Early clustering algorithms ran in a computer's main memory; a representative example is PAM (Partitioning Around Medoids), which compensates for the shortcomings of the k-means algorithm. PAM performs clustering using the medoids of the data instead of the means. PAM works well on small data sets but is difficult to apply to large ones. CLARA (Clustering LARge Applications) was therefore introduced for large data sets. This algorithm draws samples from the large data set and applies PAM to the sampled data. CLARA is limited by the fixed samples used in each clustering stage: if good medoids are not included in the sample, the clustering result is poor. CLARANS (Clustering Large Applications based upon RANdomized Search) overcomes these problems by drawing samples with some randomness. This algorithm performs clustering using the set of k medoids extracted during the clustering process at each stage. The main objective of this paper is to compare and analyze the algorithms that are widely used for clustering big data.
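The three algorithms named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the Euclidean distance, the greedy first-improvement swap rule in `pam`, and the CLARANS parameters `num_local` and `max_neighbor` (named after the commonly described formulation of that algorithm) are all assumptions made for illustration.

```python
import math
import random


def total_cost(points, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)


def pam(points, k, rng):
    """PAM-style search: swap a medoid with a non-medoid while the cost drops."""
    medoids = rng.sample(points, k)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in points:
                if candidate in medoids:
                    continue
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best:  # keep any swap that lowers the cost
                    medoids, best, improved = trial, cost, True
    return medoids, best


def clara(points, k, sample_size, n_samples, rng):
    """CLARA-style search: run PAM on random samples, score on the full data."""
    best_medoids, best_cost = None, math.inf
    for _ in range(n_samples):
        sample = rng.sample(points, min(sample_size, len(points)))
        medoids, _ = pam(sample, k, rng)
        cost = total_cost(points, medoids)  # evaluate on every point
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost


def clarans(points, k, num_local, max_neighbor, rng):
    """CLARANS-style search: try random single-medoid swaps from random starts."""
    best_medoids, best_cost = None, math.inf
    for _ in range(num_local):
        medoids = rng.sample(points, k)
        cost = total_cost(points, medoids)
        tries = 0
        while tries < max_neighbor:
            i = rng.randrange(k)
            candidate = rng.choice(points)
            trial = medoids[:i] + [candidate] + medoids[i + 1:]
            trial_cost = math.inf if candidate in medoids else total_cost(points, trial)
            if trial_cost < cost:
                medoids, cost, tries = trial, trial_cost, 0  # move and restart count
            else:
                tries += 1
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost
```

With a list of coordinate tuples `data` and `rng = random.Random(0)`, calls such as `pam(data, 2, rng)`, `clara(data, 2, 10, 5, rng)`, and `clarans(data, 2, 3, 20, rng)` each return a pair of (medoids, total cost). In this sketch `pam` evaluates every possible swap against the full data set, which is why it becomes expensive as the data grow, whereas `clara` and `clarans` limit the search by sampling points or sampling swaps.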

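Author information

이순근, Department of Industrial Information and Management Engineering, Gangneung-Wonju National University
임영문, Department of Industrial Information and Management Engineering, Gangneung-Wonju National University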
