Article Information
Abstract
English
The discovery process in data mining concerns the automatic extraction of interesting patterns and correlations from a large database. These patterns can reveal implicit relationships among sets of objects, leading to actionable rules for financial forecasting, medical diagnosis, and many other useful applications. Current studies in data mining and genetic computing concentrate on how to effectively find all objects that frequently co-occur or are correlated. For a massive database, parallel methods offer a solution to the scalability problem. In this paper, we present the design of parallel methods for genetic algorithms, clustering, and association mining tasks. The implementation of the proposed methods is based on the concurrent functional programming paradigm, using the Erlang language, which handles parallelism via a message-passing mechanism. We test our implementations on synthetic data sets and on real genetic data. The results show a good improvement in runtime.
Table of Contents
1. Introduction
2. Concurrent implementation methods
2.1. Concurrent genetic algorithms
2.2. Concurrent clustering
2.3. Concurrent association mining applied to the splice site recognition problem
3. Performance Study Results
3.1. Performance of concurrent genetic algorithms
3.2. Performance of concurrent clustering
3.3. Performance of concurrent association mining
4. Conclusion
Acknowledgements
References