Source information
Abstract
English
Mining association rules refers to extracting useful knowledge from large databases. Algorithms for this technique are both data-intensive and computation-intensive, which makes grid platforms very attractive for them. However, exploiting these platforms requires new data partitioning features that take into account the specificities of both the association rule mining technique and grids. In this paper, we propose a novel data partitioning approach for distributed association rule mining algorithms in the context of a grid computing environment. We conduct experiments using the French research grid "Grid'5000". Experimental results confirm that our data partitioning approach is sufficient for balancing the load when homogeneous clusters are used. For heterogeneous clusters, the proposed data partitioning approach constitutes the preprocessing phase of the dynamic load balancing process for distributed association rule mining.
Table of contents
1 Introduction
2 State of the art
2.1 Basic concepts
2.2 Serial algorithms
2.3 Grid-based association rule mining
3 The need of load balancing for grid-based association rule mining
4 A novel data partitioning approach for distributed ARM algorithms
4.1 Variant 1: Fair partitioning
4.2 Variant 2: Homogeneous-contents-based partitioning
4.3 Variant 3: Contents-capacity-based partitioning
5 Experimental evaluation
5.1 Testing variant 1: Fair partitioning
5.2 Testing variant 2: Homogeneous-contents-based partitioning
5.3 Testing variant 3: Contents-capacity-based partitioning
5.4 Related works
6 Conclusion
ACKNOWLEDGMENTS
References