K-Means Clustering of Shakespeare Sonnets with Selected Features

T. Senthil Selvi; R. Parimala

Security Engineering Research Support Center (IJDTA), International Journal of Database Theory and Application, Vol. 9, No. 8, August 2016, pp. 89-98 (SCOPUS)

Abstract

This paper focuses on clustering the lines of Shakespeare's Sonnets. Sonnet Line Clustering (SLC) is the task of grouping a set of lines so that lines in the same cluster are more similar to each other than to lines in other clusters. K-Means is an effective clustering technique, well known for its observed speed and simplicity. Its aim is to find the best division of N lines into K groups (clusters), so that the total distance between each group's members and its corresponding centroid is minimized. A new algorithm, Sonnet Line Clustering with Random Feature Selection (SLCRFS), is proposed. A clustering can be validated by external or internal validation. Because internal validation has no considerable impact on this research, the work concentrates on external validation measures. Entropy and purity are frequently used external validation measures for K-Means; the proposed approach uses entropy as its performance measure. The clusters formed are evaluated and interpreted according to the Euclidean distance between the data points and the center of each cluster. The paper concludes with an analysis of the results, using the proposed measure to present the sonnets clustered by K-Means with minimum entropy for different feature sets.
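
The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' SLCRFS implementation, which the abstract does not describe in detail. The function names are hypothetical, the objective uses squared Euclidean distances (the usual K-Means formulation, assumed here), and entropy is taken to be the size-weighted, base-2 entropy of the reference classes within each cluster, a common definition that may differ from the one used in the paper.

```python
import math
from collections import Counter


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def group_by_cluster(labels, items):
    """Map each cluster label to the list of items assigned to it."""
    groups = {}
    for label, item in zip(labels, items):
        groups.setdefault(label, []).append(item)
    return groups


def within_cluster_cost(points, labels):
    """Sum of squared Euclidean distances from each point to its
    cluster centroid (assumed form of the K-Means objective)."""
    cost = 0.0
    for members in group_by_cluster(labels, points).values():
        c = centroid(members)
        for p in members:
            cost += sum((x - y) ** 2 for x, y in zip(p, c))
    return cost


def entropy(labels, classes):
    """Size-weighted entropy of the reference classes inside each
    cluster (base 2, assumed definition). Lower is better."""
    n = len(labels)
    total = 0.0
    for members in group_by_cluster(labels, classes).values():
        size = len(members)
        h = -sum((cnt / size) * math.log2(cnt / size)
                 for cnt in Counter(members).values())
        total += (size / n) * h
    return total


def purity(labels, classes):
    """Fraction of items that belong to the majority reference class
    of their cluster. Higher is better."""
    groups = group_by_cluster(labels, classes)
    majority = sum(max(Counter(m).values()) for m in groups.values())
    return majority / len(labels)


# Toy data: four 2-D feature vectors with made-up reference classes.
points = [[0, 0], [0, 2], [10, 10], [10, 12]]
clusters = [0, 0, 1, 1]
reference = ["a", "a", "b", "b"]
print(within_cluster_cost(points, clusters))
print(entropy(clusters, reference), purity(clusters, reference))
```

Under these definitions, a clustering whose clusters each contain a single reference class has entropy 0 and purity 1, which is why minimum entropy can serve as a selection criterion when comparing feature sets.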

Author Information

T. Senthil Selvi Research Scholar, PG & Research Department of Computer Science, Periyar E. V. R. College, Tiruchirapalli
R. Parimala Research Adviser, Assistant Professor, PG & Research Department of Computer Science, Periyar E. V. R. College, Tiruchirapalli-23
