Source Information
Abstract
English
Streaming data are a potentially infinite sequence of data arriving at very high speed, and they may evolve over time. This poses several challenges for mining large-scale, high-speed data streams in real time, and the field has therefore attracted considerable attention from researchers over the past years. This paper discusses the challenges associated with mining such data streams. Several available stream data mining algorithms for classification and clustering are described along with their key features and significance. The performance evaluation measures relevant to streaming data classification and clustering are also explained, and their comparative significance is discussed. The paper surveys the streaming data computation platforms that have been developed and discusses each of them chronologically along with its major capabilities. It identifies the potential research directions open in high-speed, large-scale data stream mining from the algorithmic, evolving-nature, and performance evaluation measurement points of view. Finally, the Massive Online Analysis (MOA) framework is used as a use case to show the results of key streaming data classification and clustering algorithms on a sample benchmark dataset, and their performance is critically compared and analyzed using evaluation parameters specific to streaming data mining.
Table of Contents
1. Introduction
2. Dimensions of Stream Data Mining
2.1. Stream Data Mining Issues and Challenges
2.2. Stream Data Mining Algorithms
3. Recent Trends and Future Perspective
3.1. From Algorithms Development Point of View
3.2. From New Evaluation Measures Point of View
3.3. From Concept Change Identification Point of View
4. Result and Discussions: MOA Use Case
4.1. Streaming Data Classification
4.2. Streaming Data Clustering
5. Conclusion
References