Source Information
Abstract
English
Classification is a data mining technique widely used in critical domains such as financial risk analysis, biology, and communication network management. Classification accuracy and learning from distributed datasets are the most challenging topics in the field of supervised learning. In this paper, we first briefly review the background of parallel and distributed classification algorithms and then propose a novel approach to classification in large distributed datasets. This approach is based on code migration instead of data migration. Extensive experimental results using a popular benchmark test suite show the effectiveness of this approach in terms of accuracy. The results also show that the proposed method slightly improved classification accuracy over standard methods.
Table of Contents
1. Introduction
2. Parallel and Distributed Classification Algorithms
2.1. Parallel Decision Trees
2.2. Parallel Artificial Neural Networks
2.3. Parallel Boosting
2.4. Meta-learning
3. Our Proposed Method
3.1. Position of the Problem
3.2. Methodology
3.3. Distributed Learning Algorithm
3.4. Reduction of Communication Cost
4. Experimental Evaluation
5. Conclusion
References