An Improved Classification Course Based on Mapreduce

Haitao Wang; Shunfeng Liu; Zongpu Jia

Security Engineering Research Support Center (IJGDC), International Journal of Grid and Distributed Computing, Vol. 8, No. 3, June 2015, pp. 43-52

Abstract

File classification is an important step in near-duplicate detection in the data mining field. This paper proposes an improved classification course that consists of a training course and a test course, each with its own algorithm. It uses the MapReduce computing model created by Google to carry out the classification computation. Specifically, Sogou news data of various sizes, simulating a massive data set, was used to test effectiveness, and a comparative evaluation of execution time and speedup was carried out in the experimental environment. The results show that the classification course greatly reduces execution time, attains the ideal speedup ratio as the data volume increases, and achieves better performance.
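To make the MapReduce-style training step and the speedup measure concrete, here is a minimal single-process sketch. The abstract does not specify the classifier or the exact map and reduce functions, so everything here is an assumption for illustration: the per-class term counting (as a Naive Bayes-style trainer might use), the function names `map_doc`, `reduce_counts`, `train`, and `speedup`, and the toy documents, which are made up and are not the Sogou news data. Speedup is taken in its usual sense of baseline execution time divided by parallel execution time.

```python
from collections import defaultdict
from itertools import chain


def map_doc(label, text):
    """Map step: emit ((label, term), 1) for every term in one labelled document."""
    for term in text.lower().split():
        yield (label, term), 1


def reduce_counts(pairs):
    """Shuffle and reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(docs):
    """Apply the map step to every document, then reduce to per-class term counts."""
    emitted = chain.from_iterable(map_doc(label, text) for label, text in docs)
    return reduce_counts(emitted)


def speedup(t_baseline, t_parallel):
    """Speedup: baseline execution time divided by parallel execution time."""
    return t_baseline / t_parallel


# Toy, made-up documents (hypothetical; not the data set used in the paper).
docs = [
    ("sports", "goal match goal"),
    ("finance", "stock market"),
    ("sports", "match"),
]
counts = train(docs)
```

In a real MapReduce deployment the map calls would run on separate nodes and the framework would group keys before the reduce step; the sketch only mirrors that data flow in one process.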

Author information

Haitao Wang: School of Computer Science and Technology, Jilin University, QianJin Street, ChangChun, JiLin, China; Henan Polytechnic University, Shiji Street, Jiaozuo, Henan, China
Shunfeng Liu: School of Computer Science and Technology, Jilin University, QianJin Street, ChangChun, JiLin, China
Zongpu Jia: Henan Polytechnic University, Shiji Street, Jiaozuo, Henan, China
