Paper title
Neural network-based CUSUM for online change-point detection
Paper authors
Abstract
Change-point detection, the task of detecting an abrupt change in the data distribution from sequential data, is a fundamental problem in statistics and machine learning. CUSUM is a popular statistical method for online change-point detection because its recursive computation and constant memory requirement make it efficient, and it enjoys statistical optimality. CUSUM requires knowing the precise pre- and post-change distributions. However, the post-change distribution is usually unknown a priori since it represents an anomaly or novelty. Classic CUSUM can perform poorly when the assumed model does not match the actual data. While likelihood ratio-based methods encounter challenges with high-dimensional data, neural networks have become an emerging tool for change-point detection, offering computational efficiency and scalability. In this paper, we introduce a neural network CUSUM (NN-CUSUM) for online change-point detection. We also present a general theoretical condition under which the trained neural networks can perform change-point detection, and we characterize which losses can achieve this goal. We further extend our analysis by combining it with Neural Tangent Kernel theory to establish learning guarantees for the standard performance metrics, including the average run length (ARL) and expected detection delay (EDD). The strong performance of NN-CUSUM is demonstrated in detecting change-points in high-dimensional data using both synthetic and real-world data.
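The recursive computation and constant memory requirement that the abstract attributes to classic CUSUM can be illustrated with a minimal sketch. This is the textbook recursion, not the paper's NN-CUSUM; the function names (`cusum_stopping_time`, `gaussian_mean_shift_log_lr`), the Gaussian mean-shift setting, and the threshold value are all assumptions chosen for illustration.

```python
import random


def gaussian_mean_shift_log_lr(mu0, mu1, sigma=1.0):
    """Build the log-likelihood ratio log(f1(x) / f0(x)) for N(mu1, sigma^2) vs N(mu0, sigma^2).

    Assumes both distributions are known exactly, which is the requirement
    the abstract says classic CUSUM places on the user.
    """
    def log_lr(x):
        return (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2
    return log_lr


def cusum_stopping_time(samples, log_lr, threshold):
    """Return the 1-based index of the first alarm, or None if no alarm is raised.

    Only the scalar statistic `s` is carried between steps, so memory use is
    constant and each update costs one evaluation of `log_lr`.
    """
    s = 0.0
    for t, x in enumerate(samples, start=1):
        # CUSUM recursion: accumulate evidence for a change, resetting at zero.
        s = max(s + log_lr(x), 0.0)
        if s > threshold:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical data: mean 0 before the change, mean 1 afterwards.
    rng = random.Random(0)
    change_point = 200
    stream = [rng.gauss(0.0, 1.0) for _ in range(change_point)]
    stream += [rng.gauss(1.0, 1.0) for _ in range(200)]

    alarm = cusum_stopping_time(
        stream, gaussian_mean_shift_log_lr(0.0, 1.0), threshold=8.0
    )
    print("alarm raised at sample:", alarm)
```

Raising `threshold` makes false alarms rarer (a longer ARL) at the cost of a longer detection delay (EDD), which is the trade-off the two metrics named in the abstract measure.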