Paper Title

Information Mandala: Statistical Distance Matrix with Clustering

Author

Lu, Xin

Abstract

In machine learning, observed features are measured in a metric space so that a distance function can be obtained for optimization. When similar features are statistically sufficient to be treated as a population, a statistical distance between two probability distributions can be calculated for more precise learning. Provided that the observed features are multi-valued, the statistical distance function remains efficient. However, because its output is a scalar, it cannot represent the detailed distances between individual feature elements. To resolve this problem, this paper extends the traditional statistical distance to a matrix form, called a statistical distance matrix. In experiments, the proposed approach performs well in object recognition tasks and clearly and intuitively represents the dissimilarities between cat and dog images in the CIFAR dataset, even when calculated directly from the image pixels. Applying hierarchical clustering to the statistical distance matrix separates the image pixels into several clusters that are arranged geometrically around a center, like a Mandala pattern. The statistical distance matrix with clustering, called the Information Mandala, goes beyond ordinary saliency maps and can help in understanding the basic principles of convolutional neural networks.
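The abstract does not state how the statistical distance matrix is defined, so the following is only an illustrative sketch of the general idea (an element-wise distance matrix followed by hierarchical clustering), not the paper's formulation. It assumes each feature element (for example, a pixel) is summarized by a univariate Gaussian and uses the Bhattacharyya distance as a stand-in statistical distance. The function names `distance_matrix` and `cluster_elements`, the cluster count, and the synthetic data are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def distance_matrix(a, b, eps=1e-8):
    """Element-wise Bhattacharyya distances between two sample sets.

    a, b: arrays of shape (n_samples, n_elements), one per class.
    Entry (i, j) compares element i of `a` with element j of `b`,
    each modelled as a univariate Gaussian (an assumption of this sketch,
    not something taken from the paper).
    """
    mu_a, var_a = a.mean(axis=0), a.var(axis=0) + eps
    mu_b, var_b = b.mean(axis=0), b.var(axis=0) + eps
    var_sum = var_a[:, None] + var_b[None, :]
    mean_term = 0.25 * (mu_a[:, None] - mu_b[None, :]) ** 2 / var_sum
    var_term = 0.5 * np.log(
        var_sum / (2.0 * np.sqrt(var_a[:, None] * var_b[None, :]))
    )
    return mean_term + var_term


def cluster_elements(dist, n_clusters=4):
    """Group elements by hierarchical clustering of their distance profiles.

    Each row of `dist` is treated as one element's profile; rows are
    merged with average linkage on Euclidean distance between profiles.
    """
    tree = linkage(dist, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic stand-in for two image classes flattened to 16 "pixels" each.
rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(200, 16))
class_b = rng.normal(0.5, 1.0, size=(200, 16))

dist = distance_matrix(class_a, class_b)  # shape (16, 16)
labels = cluster_elements(dist)           # one cluster id per element
```

With real images, the cluster labels could be reshaped to the image height and width to see how the clusters are arranged spatially; whether that arrangement resembles the Mandala pattern described above depends on the paper's actual distance definition.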
