Replace Missing Values with EM algorithm based on GMM and Naïve Bayesian

Xi-Yu Zhou; Joon S. Lim

Source information

Security Engineering Research Support Center (IJSEIA), International Journal of Software Engineering and Its Applications, Vol. 8, No. 5, May 2014, pp. 177-188, SCOPUS

Citation count: 0 (data provided by Naver Academic Information)

Abstract (English)

In data mining applications, experimental datasets contain various kinds of missing values. Leaving missing values unsubstituted, or treating them inappropriately, is likely to cause many warnings or errors. In addition, many classification algorithms are very sensitive to missing values. For these reasons, handling missing values is an important phase in many classification and data mining tasks. This paper introduces the traditional EM algorithm and its disadvantages. We propose a new method for imputing missing values based on the EM algorithm, which uses Naive Bayesian to improve accuracy. We conclude by classifying the seeds dataset and the vertebral column dataset and comparing the results with those obtained from two other missing value handling methods: the traditional EM algorithm and the non-substitution method. The experimental results show that the proposed algorithm stably improves classification accuracy on large datasets that contain many missing values.
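
To make the idea of EM-based imputation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single bivariate Gaussian (the one-component special case of a GMM), a fully observed feature x, and a feature y that may be missing (marked None). The function name em_impute and the parameter n_iter are hypothetical. The Naive Bayesian refinement proposed in the paper is not shown, because the abstract does not specify it.

```python
def em_impute(rows, n_iter=200):
    """EM imputation sketch for (x, y) pairs where y may be None.

    Assumes a single bivariate Gaussian and that x is not constant.
    Returns a new list in which each missing y is replaced by its
    conditional mean given x under the fitted parameters.
    """
    n = len(rows)
    observed = [y for _, y in rows if y is not None]

    # x is fully observed, so its mean and variance never change.
    mu_x = sum(x for x, _ in rows) / n
    s_xx = sum((x - mu_x) ** 2 for x, _ in rows) / n

    # Start from the observed y values and zero covariance.
    mu_y = sum(observed) / len(observed)
    s_yy = sum((y - mu_y) ** 2 for y in observed) / len(observed)
    s_xy = 0.0

    for _ in range(n_iter):
        # E-step: expected y and y^2 for every row.
        beta = s_xy / s_xx
        cond_var = s_yy - beta * s_xy
        e_y, e_yy = [], []
        for x, y in rows:
            if y is None:
                m = mu_y + beta * (x - mu_x)
                e_y.append(m)
                e_yy.append(m * m + cond_var)
            else:
                e_y.append(y)
                e_yy.append(y * y)

        # M-step: re-estimate the parameters from expected statistics.
        mu_y = sum(e_y) / n
        s_yy = sum(e_yy) / n - mu_y ** 2
        s_xy = sum(x * m for (x, _), m in zip(rows, e_y)) / n - mu_x * mu_y

    beta = s_xy / s_xx
    return [(x, y if y is not None else mu_y + beta * (x - mu_x))
            for x, y in rows]


if __name__ == "__main__":
    data = [(1, 3), (2, 5), (3, 7), (4, 9), (5, None), (6, None)]
    for row in em_impute(data):
        print(row)
```

In this toy input the observed pairs lie exactly on a line, so the imputed values converge to the points on that line. On real data the conditional mean would only approximate the missing entries.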

Table of contents

Abstract
1. Introduction
  1.1. Traditional EM Algorithm
  1.2. EM algorithm with Naive Bayesian
  1.3. Code Implementation
2. Data Implement and Classification Result
  2.1. Data Implementation
  2.2. Implement the Missing Values
  2.3. Classification Results
3. Experimental Results
Acknowledgment
References

Author information

Xi-Yu Zhou I.T. College Gachon University Seongnam, South Korea
Joon S. Lim I.T. College Gachon University Seongnam, South Korea
