An Improved ID3 Decision Tree Algorithm on Imbalance Datasets Using Strategic Oversampling

Abstract

English

Data mining is the process of extracting useful information from vast and complex databases. In real-world scenarios, data sources contain many varied kinds of data, including imbalanced data. Imbalanced datasets contain a large percentage of instances from one class and a very small percentage of instances from the other class. The traditional decision tree algorithm Iterative Dichotomiser 3 (ID3) was not built to handle imbalanced datasets. To overcome this drawback of ID3 on imbalanced datasets, improved algorithms are needed. In this paper, we propose an extension of the ID3 algorithm called Over Sampled ID3 (OSID3) for imbalanced data learning. The proposed OSID3 approach uses oversampling with a unique statistical oversampling strategy: it removes the less privileged instances at an early stage and later oversamples the highly privileged instances to approximately balance the data. The experimental observations suggest that the proposed approach improves on the benchmark ID3 in terms of accuracy, Area Under Curve (AUC), and Root Mean Square Error (RMSE) on 15 imbalanced datasets from the University of California, Irvine (UCI) repository.
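
The two ingredients named in the abstract, the ID3 splitting criterion and oversampling toward class balance, can be illustrated with a minimal Python sketch. This is not the OSID3 strategy itself, which the abstract does not specify: plain random duplication of minority-class rows is swapped in as an assumption, and the function names (`entropy`, `information_gain`, `random_oversample`) and the toy `outlook` data are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """ID3 splitting criterion: entropy reduction from splitting on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


def random_oversample(rows, labels, seed=0):
    """Assumption: plain random oversampling, not the paper's statistical strategy.

    Duplicates randomly chosen rows of each smaller class until every class
    has as many rows as the largest class.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: five "no" instances against a single "yes" instance.
rows = [{"outlook": "sunny"}, {"outlook": "sunny"}, {"outlook": "sunny"},
        {"outlook": "rain"}, {"outlook": "rain"}, {"outlook": "overcast"}]
labels = ["no", "no", "no", "no", "no", "yes"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(labels), Counter(balanced_labels))
print(information_gain(rows, labels, "outlook"))
```

A tree learner would then be trained on `balanced_rows` and `balanced_labels` instead of the original data, so that the entropy computed at each node is no longer dominated by the majority class.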

Table of Contents

Abstract
 1. Introduction
 2. Current Approaches in Decision Trees
 3. The Proposed Approach
 4. Investigational Design and Assessment Criteria
 5. Results
 6. Conclusion
 References

Author Information

  • L. Surya Prasanthi, Research Scholar, Department of Computer Science, Krishna University, Machilipatnam, India
  • R. Kiran Kumar, Department of Computer Science, Krishna University, Machilipatnam, India
  • Kudipudi Srinivas, Department of Computer Science & Engineering, V.R. Siddartha Engineering College, Vijayawada, India
