Learning state machines via efficient hashing of future traces

Paper title

Learning state machines via efficient hashing of future traces

Authors

Robert Baumgartner, Sicco Verwer

Abstract

State machines are popular models for describing and visualizing discrete systems such as software systems, and for representing regular grammars. Most algorithms that passively learn state machines from data assume that all the data is available from the beginning and load it into memory. This makes them hard to apply to continuously streaming data and leads to large memory requirements on large datasets. In this paper we propose a method to learn state machines from data streams using the count-min sketch data structure to reduce memory requirements. We apply state merging using the well-known red-blue framework to reduce the search space. We implemented our approach in an established framework for learning state machines and evaluated it on a well-known dataset to provide experimental data, showing the effectiveness of our approach with respect to the quality of the results and runtime.
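
To make the count-min sketch mentioned in the abstract concrete, here is a minimal Python sketch of the data structure, applied to counting short future traces from a stream. It is an illustration under stated assumptions, not the authors' implementation: the class `CountMinSketch`, the helper `future_key`, the table sizes, and the use of salted BLAKE2b as the hash family are all hypothetical choices made for this example.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency table; estimates never fall below the true count."""

    def __init__(self, width=64, depth=4):
        # width and depth are illustrative defaults, not values from the paper.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, obtained by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the row minimum is the best estimate.
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )


def future_key(symbols):
    # Hypothetical encoding: join the upcoming symbols into a single string key.
    return "\x1f".join(symbols)


# Feed a small stream of future traces into the sketch, one at a time.
sketch = CountMinSketch()
for trace in [["a", "b"], ["a", "b"], ["a", "c"]]:
    sketch.add(future_key(trace))
```

Memory use depends only on `width` and `depth`, not on the number of traces seen, which is the property that makes the structure attractive for streaming data. The price is that `estimate` may overcount when different traces collide in every row.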
