Graph Regularized Autoencoder and its Application in Unsupervised Anomaly Detection

Paper Title

Graph Regularized Autoencoder and its Application in Unsupervised Anomaly Detection

Authors

Imtiaz Ahmed, Travis Galoppo, Xia Hu, Yu Ding

Abstract

Dimensionality reduction is a crucial first step for many unsupervised learning tasks including anomaly detection and clustering. Autoencoders are a popular mechanism to accomplish dimensionality reduction. In order to make dimensionality reduction effective for high-dimensional data embedding a nonlinear low-dimensional manifold, it is understood that some sort of geodesic distance metric should be used to discriminate the data samples. Inspired by the success of geodesic distance approximators such as ISOMAP, we propose to use a minimum spanning tree (MST), a graph-based algorithm, to approximate the local neighborhood structure and generate structure-preserving distances among data points. We use this MST-based distance metric to replace the Euclidean distance metric in the embedding function of autoencoders and develop a new graph regularized autoencoder, which outperforms a wide range of alternative methods over 20 benchmark anomaly detection datasets. We further incorporate the MST regularizer into two generative adversarial networks and find that using the MST regularizer improves the performance of anomaly detection substantially for both generative adversarial networks. We also test our MST regularized autoencoder on two datasets in a clustering application and witness its superior performance as well.
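
The abstract does not spell out how the MST-based distances are computed, so the following is only a rough sketch of one plausible reading: build a minimum spanning tree over the pairwise Euclidean distances and take the path length along the tree as the distance between two points. The function name `mst_distances` and the toy data are illustrative assumptions, not taken from the paper, and the step that plugs these distances into the autoencoder loss is not shown.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
from scipy.spatial.distance import pdist, squareform


def mst_distances(X):
    """Pairwise path lengths along the Euclidean minimum spanning tree of X.

    Hypothetical helper for illustration; not the authors' implementation.
    Assumes the rows of X are distinct, because SciPy treats zero entries
    of a dense input as missing edges.
    """
    euclid = squareform(pdist(X))          # dense (n, n) Euclidean distances
    tree = minimum_spanning_tree(euclid)   # sparse matrix with the n - 1 tree edges
    # Treat the tree as undirected and sum edge weights along the unique path.
    return shortest_path(tree, method="D", directed=False)


# Toy data: points on a half circle, a 1-D manifold embedded in 2-D.
theta = np.linspace(0.0, np.pi, 50)
X = np.column_stack([np.cos(theta), np.sin(theta)])
D = mst_distances(X)
```

On this toy curve the tree distance between the two endpoints follows the arc (close to pi), whereas the Euclidean distance cuts straight across (exactly 2), which is the kind of structure-preserving behavior the abstract attributes to geodesic-style metrics.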
