Paper Title
Robust Learning of Deep Time Series Anomaly Detection Models with Contaminated Training Data

Authors

Wenkai Li, Cheng Feng, Ting Chen, Jun Zhu

Abstract
Time series anomaly detection (TSAD) is an important data mining task with numerous applications in the IoT era. In recent years, a large number of deep neural network-based methods have been proposed, demonstrating significantly better performance than conventional methods in addressing challenging TSAD problems in a variety of areas. Nevertheless, these deep TSAD methods typically rely on a clean training dataset that is not polluted by anomalies to learn the "normal profile" of the underlying dynamics. This requirement is nontrivial since a clean dataset can hardly be provided in practice. Moreover, without awareness of their robustness, blindly applying deep TSAD methods with potentially contaminated training data can incur significant performance degradation in the detection phase. In this work, to tackle this important challenge, we first investigate the robustness of commonly used deep TSAD methods with contaminated training data, which provides a guideline for applying these methods when the provided training data are not guaranteed to be anomaly-free. Furthermore, we propose a model-agnostic method that can effectively improve the robustness of learning mainstream deep TSAD models with potentially contaminated data. Experiment results show that our method can consistently prevent or mitigate performance degradation of mainstream deep TSAD models on widely used benchmark datasets.
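A minimal sketch of the failure mode the abstract describes: a detector learns a "normal profile" from training data and flags points exceeding a threshold derived from it. This is not the authors' method; it uses a simple Gaussian threshold as a hypothetical stand-in for a deep TSAD model, to show how anomalies in the training set inflate the learned threshold and cause misses at detection time.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_threshold(train, k=3.0):
    """Learn a 'normal profile' as mean/std of the training data;
    points above mu + k*sigma are flagged as anomalies."""
    mu, sigma = train.mean(), train.std()
    return mu + k * sigma

# Clean training data: normal behavior around 0
clean = rng.normal(0.0, 1.0, 1000)

# Contaminated training data: 5% of points replaced by large anomalies
contaminated = clean.copy()
idx = rng.choice(1000, 50, replace=False)
contaminated[idx] = rng.normal(10.0, 1.0, 50)

thr_clean = fit_threshold(clean)          # roughly 3 for this data
thr_dirty = fit_threshold(contaminated)   # inflated by the contamination

# A moderate test-time anomaly is caught under the clean threshold,
# but the contaminated threshold has drifted above it and misses it.
test_anomaly = 6.0
print("detected with clean model:", test_anomaly > thr_clean)
print("detected with dirty model:", test_anomaly > thr_dirty)
```

The contamination both shifts the mean and, more importantly, inflates the estimated standard deviation, so the threshold drifts toward the anomalies themselves; the same effect appears in deep models when reconstruction or forecasting losses are minimized over anomalous training points.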