论文标题

在线变更检测指数窗口上的最大平均差异

Maximum Mean Discrepancy on Exponential Windows for Online Change Detection

论文作者

Kalinke, Florian, Heyden, Marco, Gntuni, Georg, Fouché, Edouard, Böhm, Klemens

论文摘要

在分析数据流并具有许多应用程序时,例如预测性维护,欺诈检测或医学,检测变化至关重要。一种检测变化的原则方法是通过假设检验将流中观测值的分布相互比较。最大平均差异(MMD)是概率分布空间上的(半)度量,可在内核添加域上提供强大的非参数两样本测试。特别是,MMD能够在轻度条件下检测分布之间的任何差异。但是,经典的MMD估计值遭受了二次运行时复杂性,这使其直接用于数据流中的更改检测。在本文中,我们提出了一种新的更改检测算法,称为指数窗口(MMDEW)上的最大平均差异,该算法将MMD的好处与基于指数窗口的有效计算相结合。我们证明,MMDEW享受Polygarithmic的运行时和对数内存的复杂性,并从经验上表明,它在基准数据流上表现出色。

Detecting changes is of fundamental importance when analyzing data streams and has many applications, e.g., in predictive maintenance, fraud detection, or medicine. A principled approach to detect changes is to compare the distributions of observations within the stream to each other via hypothesis testing. Maximum mean discrepancy (MMD), a (semi-)metric on the space of probability distributions, provides powerful non-parametric two-sample tests on kernel-enriched domains. In particular, MMD is able to detect any disparity between distributions under mild conditions. However, classical MMD estimators suffer from a quadratic runtime complexity, which renders their direct use for change detection in data streams impractical. In this article, we propose a new change detection algorithm, called Maximum Mean Discrepancy on Exponential Windows (MMDEW), that combines the benefits of MMD with an efficient computation based on exponential windows. We prove that MMDEW enjoys polylogarithmic runtime and logarithmic memory complexity and show empirically that it outperforms the state of the art on benchmark data streams.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源