Source information
Abstract
English
In this paper, we propose a new boosting algorithm for distributed databases. The main idea of the proposed method is to exploit the parallelism of distributed databases to build an ensemble of classifiers. In each round of the algorithm, each site processes its own data locally and computes all the information needed. A central site collects this information from all sites and builds the global classifier, which then becomes one classifier in the ensemble. Each distributed site also uses this global classifier to compute the information required for the next round. By repeating this process, an ensemble of classifiers that is almost identical to the one built on the whole data is produced from the distributed databases. The experiments were performed on 5 different datasets from the UCI repository [9]. The experimental results show that the accuracy of the proposed algorithm is almost equal to or higher than the accuracy obtained when applying the boosting algorithm to the whole database.
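The round structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the base classifier or what information the sites exchange, so this sketch assumes AdaBoost-style weights, decision stumps as the base classifier, a candidate stump list agreed on by all sites in advance, and weighted error sums as the information sent to the central site. All function and variable names are hypothetical.

```python
import math

# A stump is (feature_index, threshold, polarity); labels are +1 / -1.
def stump_predict(stump, x):
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else -polarity

def local_errors(site, candidates):
    """Runs at a site: weighted error of every candidate stump on local data."""
    X, y, w = site["X"], site["y"], site["w"]
    errs = [sum(wi for xi, yi, wi in zip(X, y, w) if stump_predict(c, xi) != yi)
            for c in candidates]
    return errs, sum(w)  # only these summaries leave the site (assumption)

def local_reweight(site, stump, alpha):
    """Runs at a site: AdaBoost weight update using the global classifier."""
    site["w"] = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
                 for xi, yi, wi in zip(site["X"], site["y"], site["w"])]

def distributed_boost(sites, candidates, rounds):
    for s in sites:
        s["w"] = [1.0] * len(s["y"])  # uniform initial weights
    ensemble = []
    for _ in range(rounds):
        # Each site computes its summaries locally (could run in parallel).
        reports = [local_errors(s, candidates) for s in sites]
        # Central site: aggregate and pick the globally best stump.
        total = sum(t for _, t in reports)
        global_err = [sum(r[0][k] for r in reports) / total
                      for k in range(len(candidates))]
        best = min(range(len(candidates)), key=global_err.__getitem__)
        err = min(max(global_err[best], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, candidates[best]))
        # The global classifier is sent back; sites update weights locally.
        for s in sites:
            local_reweight(s, candidates[best], alpha)
    return ensemble

def classify(ensemble, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

Under these assumptions the aggregated weighted errors equal the errors computed on the pooled data, so the central site selects the same stumps as boosting on the whole database would, up to floating-point rounding.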
Table of contents
1. Introduction
2. Boosting algorithm
3. Distributed boosting algorithm
3.1. Training phase
3.2. Classification phase
3.3. Base classifier
4. Performance evaluation
5. Experiments
5.1. Datasets
5.2. Performance comparison
6. Conclusion
References