Short note on Mining Data Streams.
Mining Data Streams
Stream data are data that flow into a system in vast volumes, change dynamically, are possibly infinite, and contain multidimensional features. Such data cannot be stored in traditional database systems. Moreover, most systems may only be able to read the stream once, in sequential order. This poses great challenges for the effective mining of stream data. Substantial research has led to progress in the development of efficient methods for mining data streams, in the areas of mining frequent and sequential patterns, multidimensional analysis (e.g., the construction of stream cubes), classification, clustering, outlier analysis, and online detection of rare events in data streams. The general philosophy is to develop single-scan or few-scan algorithms that work with limited computing and storage capabilities.
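As an illustration of the single-scan, limited-storage idea, the sketch below uses the Misra-Gries frequent-items summary. This technique is not named in the note itself; it is chosen here only as a well-known example. The function name `misra_gries` and the parameter `k` are illustrative. The summary reads the stream once and keeps at most k - 1 counters, so any item that occurs more than n/k times in a stream of length n is guaranteed to remain in the result, although its stored count may underestimate the true count.

```python
def misra_gries(stream, k):
    """Single-pass frequent-items summary using at most k - 1 counters.

    Illustrative sketch: items occurring more than n/k times in a stream
    of length n are guaranteed to survive; stored counts are lower bounds.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for item in stream:  # each element is seen exactly once
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    # 'a' occurs 4 times out of 8, which is more than 8/3.
    print(misra_gries("abacabad", 3))
```

The stored counts are approximate, so a second scan (when one is possible) can be used to verify the exact frequencies of the surviving candidates.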
This includes collecting information about stream data in sliding windows or tilted time windows (where the most recent data are registered at the finest granularity and the more distant data are registered at a coarser granularity), and exploring techniques like microclustering, limited aggregation, and approximation. Many applications of stream data mining can be explored, for example, real-time detection of anomalies in computer network traffic, botnets, text streams, video streams, power-grid flows, web searches, sensor networks, and cyber-physical systems.
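The tilted time window idea can be sketched as follows. This is a simplified, logarithmic-style variant written for illustration; the class name `TiltedTimeWindow` and the parameters `levels` and `per_level` are hypothetical, and the values are assumed to be summable aggregates such as counts. Level 0 holds the newest values at the finest granularity; when a level overflows, its two oldest buckets are merged into one coarser bucket at the next level, and the coarsest level simply discards its oldest bucket.

```python
from collections import deque


class TiltedTimeWindow:
    """Illustrative tilted time window: a bucket at level i covers 2**i time units."""

    def __init__(self, levels=3, per_level=2):
        if levels < 1 or per_level < 1:
            raise ValueError("levels and per_level must be at least 1")
        self.per_level = per_level
        # Each deque stores its newest bucket on the left, oldest on the right.
        self.levels = [deque() for _ in range(levels)]

    def add(self, value):
        """Register the aggregate for the newest time unit."""
        carry = value
        last = len(self.levels) - 1
        for i, level in enumerate(self.levels):
            level.appendleft(carry)
            if len(level) <= self.per_level:
                return
            if i == last:
                level.pop()  # coarsest level: forget the oldest bucket
                return
            # Merge the two oldest buckets and push the result one level up.
            carry = level.pop() + level.pop()

    def snapshot(self):
        """Return the buckets per level, finest granularity first."""
        return [list(level) for level in self.levels]


if __name__ == "__main__":
    window = TiltedTimeWindow(levels=3, per_level=2)
    for count in range(1, 8):
        window.add(count)
    print(window.snapshot())
```

With this layout the memory used grows with the number of levels rather than with the length of the stream, which is the trade-off the tilted window makes: recent data stay detailed, while older data survive only as coarse summaries.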