Abstract:
As opposed to static and finite data sets, data streams are voluminous,
possibly infinite and changing, and contain continuous data. Mining is the process of
extracting interesting patterns from data. Data stream mining is popular because it
provides solutions to many real-life problems.
Extending traditional algorithms to data streams is difficult because
frequent pattern mining is a blocking operation: it needs the complete input before it
can produce results, and a stream never ends. Frequent pattern mining methods are
used for trend analysis and in many other techniques, e.g. association rule mining,
sequential pattern mining, structured pattern mining, iceberg cube computation, cube
gradient analysis, associative classification and frequent pattern-based clustering.
We propose S-Patt, a framework for maintaining patterns over data
streams. We study the problem of maintaining frequent as well as infrequent patterns.
S-Patt summarizes the frequent and infrequent pattern information in a data structure.
It operates at transaction-level granularity and thus requires little memory. It requires
only a single scan over the data. Our technique dynamically updates the pattern data
structure according to the patterns encountered.
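To make the single-scan, per-transaction update concrete, the following is a minimal sketch and not the S-Patt data structure itself. The class name PatternSummary, the size cap maxPatternSize and the support threshold passed to isFrequent are hypothetical, introduced only for illustration. The sketch counts every itemset of bounded size as each transaction arrives, so frequent and infrequent patterns live in the same summary and no transaction is revisited.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: a one-pass itemset counter, not the actual S-Patt structure.
class PatternSummary {
    // Count of every itemset (up to maxPatternSize items) seen so far.
    private final Map<Set<String>, Integer> counts = new HashMap<>();
    private final int maxPatternSize; // assumed cap that bounds memory use
    private long transactions = 0;

    PatternSummary(int maxPatternSize) {
        this.maxPatternSize = maxPatternSize;
    }

    // Process one transaction exactly once; it can be discarded afterwards.
    void addTransaction(Collection<String> items) {
        transactions++;
        // Remove duplicates and fix an order so each itemset is generated once.
        List<String> sorted = new ArrayList<>(new TreeSet<>(items));
        enumerate(sorted, 0, new ArrayList<>());
    }

    // Increment the count of every non-empty subset of bounded size.
    private void enumerate(List<String> items, int start, List<String> prefix) {
        for (int i = start; i < items.size(); i++) {
            prefix.add(items.get(i));
            counts.merge(new TreeSet<>(prefix), 1, Integer::sum);
            if (prefix.size() < maxPatternSize) {
                enumerate(items, i + 1, prefix);
            }
            prefix.remove(prefix.size() - 1);
        }
    }

    // Fraction of the transactions seen so far that contain the pattern.
    double support(Collection<String> pattern) {
        if (transactions == 0) {
            return 0.0;
        }
        return counts.getOrDefault(new TreeSet<>(pattern), 0) / (double) transactions;
    }

    // A pattern is frequent if its support reaches the given threshold;
    // anything below the threshold is still kept as an infrequent pattern.
    boolean isFrequent(Collection<String> pattern, double minSupport) {
        return support(pattern) >= minSupport;
    }
}
```

Because infrequent patterns are retained alongside frequent ones, a pattern that becomes frequent later in the stream already has its full count. The cap on pattern size is an assumption of this sketch to keep the enumeration bounded; a real summary structure would need its own pruning strategy.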
The framework is implemented in Java as a data stream mining
application, “Stream Xplorer”. The application performs itemset mining on a simulated
stream and reports the analysis results to the user through different graphs.
The framework can be extended to sequential pattern analysis and to
clustering and classification algorithms.