Paper Title

Learning a Deep Part-based Representation by Preserving Data Distribution

Paper Authors

Anyong Qin, Zhaowei Shang, Zhuolin Tan, Taiping Zhang, Yuan Yan Tang

Abstract

Unsupervised dimensionality reduction is one of the commonly used techniques for high-dimensional data recognition problems. A deep autoencoder network that constrains its weights to be non-negative can learn a low-dimensional, part-based representation of the data. On the other hand, the inherent structure of each data cluster can be described by the distribution of its intraclass samples. One therefore hopes to learn a new low-dimensional representation that preserves the intrinsic structure embedded in the original high-dimensional data space. In this paper, a deep part-based representation is learned by preserving the data distribution, and the novel algorithm is called Distribution Preserving Network Embedding (DPNE). In DPNE, we first estimate the distribution of the original high-dimensional data using $k$-nearest-neighbor kernel density estimation, and then seek a part-based representation that respects this distribution. Experimental results on real-world data sets show that the proposed algorithm performs well in terms of clustering accuracy and AMI. It turns out that the manifold structure of the raw data can be well preserved in the low-dimensional feature space.
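The first stage described in the abstract, estimating the density of the original data with a $k$-nearest-neighbor kernel density estimator, can be sketched in a few lines of NumPy. This is a minimal illustration under common assumptions (Euclidean distance, the classic $k/(n V_k)$ ball-volume estimator), not the paper's implementation; the function name `knn_density` is hypothetical.

```python
import numpy as np
from math import gamma, pi

def knn_density(X, k=5):
    """k-NN density estimate: k / (n * volume of the ball reaching the k-th neighbor)."""
    n, dim = X.shape
    # Pairwise Euclidean distances between all samples
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # Distance to the k-th nearest neighbor (column 0 of the sorted row is the point itself)
    r_k = np.sort(d, axis=1)[:, k]
    # Volume of a dim-dimensional ball of radius r_k
    vol = (pi ** (dim / 2) / gamma(dim / 2 + 1)) * r_k ** dim
    return k / (n * vol)

# A tight cluster plus one distant outlier: the outlier should get far lower density
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(50, 2) * 0.1, [[10.0, 10.0]]])
dens = knn_density(X, k=5)
```

The resulting per-sample densities are the kind of distributional description that the second stage of DPNE would then ask the low-dimensional part-based representation to respect.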
