Paper Title
Incremental Mining of Frequent Serial Episodes Considering Multiple Occurrences
Paper Authors
Paper Abstract
The need to analyze information from streams arises in a variety of applications. One fundamental research direction is mining sequential patterns over data streams. Current studies mine series of items based on whether a pattern is present in a transaction, but pay no attention to series of itemsets or to multiple occurrences of a pattern. Patterns over a window of an itemset stream, together with their multiple occurrences, however, make it possible to recognize essential characteristics of the patterns and inter-relationships among them that existing presence-based studies cannot identify. In this paper, we study this new sequential pattern mining problem and propose a corresponding sequential miner with novel strategies to prune the search space efficiently. Experiments on both real and synthetic data show the utility of our approach.