Radial Autoencoders for Enhanced Anomaly Detection

Paper title

Radial Autoencoders for Enhanced Anomaly Detection

Authors

Augustin, Mihai-Cezar; Bonvin, Vivien; Houssou, Regis; Rappos, Efstratios; Robert-Nicoud, Stephan

Abstract

In classification problems, supervised machine-learning methods outperform traditional algorithms, thanks to the ability of neural networks to learn complex patterns. However, in two-class classification tasks such as anomaly or fraud detection, unsupervised methods could do even better, because their predictions are not limited to previously learned types of anomalies. An intuitive approach to anomaly detection can be based on the distances from the centers of mass of the two respective classes. Autoencoders, although trained without supervision, can also detect anomalies: relative to the center of mass of the normal points, reconstructions now have radii, with the largest radii most likely indicating anomalous points. Of course, radius-based classification was already possible without interposing an autoencoder; in any space, radial classification can be performed to some extent. In order to outperform it, we apply radial deformations to the data (i.e. centric compressions or expansions of axes) and then train autoencoders. Any autoencoder that makes use of a data center is here baptized a centric autoencoder (cAE). A special type is the cAE trained on a uniformly compressed dataset, named the centripetal autoencoder (cpAE). The new concept is studied here in relation to a schematic artificial dataset, and the derived methods show consistent score improvements. When tested on real banking data, however, our supervised radial-deformation algorithms alone still perform better than cAEs, as expected of most supervised methods; nonetheless, in hybrid approaches, cAEs can be combined with a radial deformation of space, which improves the classification score. We expect that centric autoencoders will become irreplaceable objects in geometry-based live anomaly detection, thanks to their ability to build naturally on geometrical algorithms and their native capability of detecting unknown anomaly types.
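
As a rough illustration of the radial baseline and the uniform compression mentioned in the abstract, the following Python sketch scores points by their distance from the center of mass of the normal points and applies a uniform centripetal compression. It is not the authors' implementation and contains no autoencoder; the function names (radial_scores, compress_toward_center), the synthetic data, the 0.99 quantile threshold, and the compression factor are all assumptions chosen for illustration.

```python
import numpy as np

def radial_scores(X, center):
    """Euclidean distance (radius) of each row of X from the given center."""
    return np.linalg.norm(X - center, axis=1)

def compress_toward_center(X, center, factor):
    """Uniform radial deformation: scale every offset from the center by `factor`.

    factor < 1 compresses the data toward the center, factor > 1 expands it.
    """
    return center + factor * (X - center)

# Hypothetical synthetic data: a Gaussian cloud of normal points and a ring
# of anomalies placed at radius 6 around the origin.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
directions = rng.normal(size=(20, 2))
anomalies = 6.0 * directions / np.linalg.norm(directions, axis=1, keepdims=True)

# Center of mass of the normal points, and a radius threshold taken as the
# 0.99 quantile of the normal radii (an assumed, illustrative choice).
center = normal.mean(axis=0)
threshold = np.quantile(radial_scores(normal, center), 0.99)

# Radial classification: flag every point whose radius exceeds the threshold.
X = np.vstack([normal, anomalies])
flags = radial_scores(X, center) > threshold

# Uniformly compressed copy of the data (assumed factor 0.5); a dataset like
# this is what a centripetal autoencoder would be trained on, per the abstract.
X_compressed = compress_toward_center(X, center, 0.5)
```

In this sketch, flags marks the points whose radius exceeds the threshold, and X_compressed keeps every point on the same ray from the center with half its original radius.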
