Article Information
Abstract
English
Extracting useful knowledge from data sets measuring in gigabytes and even terabytes is a challenging research area for the data mining community. Sequential approaches perform poorly because they have to mine voluminous databases. Parallelism has been introduced as an important solution that can improve the response time and scalability of these approaches. However, the parallelization process is not trivial and still faces many challenges, including the workload balancing problem.
In this paper, we propose a hierarchical dynamic load balancing strategy for parallel association rule mining algorithms in the context of a Grid computing environment. The French research grid “Grid’5000” is used as our experimental test-bed. Through a detailed experimental study, we show that our strategy improves the performance and helps the parallel algorithm to scale very well with the number of computational nodes available.
Table of Contents
1. Introduction
2. Mining Association Rules
3. Parallel and Distributed Mining of Association Rules
4. Load Balancing and Data Mining
5. The Hierarchical Grid Model
5.1. The 1/N Model
5.2. The C/N Model
6. Characteristics of the Proposed Model
7. The Hierarchical Workload Balancing Strategy
7.1. The Workload Balancing Algorithms
8. Performance Evaluation
8.1. Parallelization Approach
8.2. Experimental Platform
8.3. Experiments
9. Discussion
10. Conclusion
Acknowledgements
References