Paper Title
Augmented Sliced Wasserstein Distances
Paper Authors
Abstract
While theoretically appealing, the application of the Wasserstein distance to large-scale machine learning problems has been hampered by its prohibitive computational cost. The sliced Wasserstein distance and its variants improve computational efficiency through random projections, yet they suffer from low accuracy if the number of projections is not sufficiently large, because the majority of projections result in trivially small values. In this work, we propose a new family of distance metrics, called augmented sliced Wasserstein distances (ASWDs), constructed by first mapping samples to higher-dimensional hypersurfaces parameterized by neural networks. It is derived from a key observation that (random) linear projections of samples residing on these hypersurfaces translate to much more flexible nonlinear projections in the original sample space, so they can capture complex structures of the data distribution. We show that the hypersurfaces can be optimized efficiently by gradient ascent. We provide the condition under which the ASWD is a valid metric and show that this condition can be satisfied by an injective neural network architecture. Numerical results demonstrate that the ASWD significantly outperforms other Wasserstein variants on both synthetic and real-world problems.
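The construction described above can be sketched in a few lines. The fragment below is a minimal NumPy illustration, not the paper's implementation: `sliced_wasserstein` is the standard Monte Carlo estimate of the sliced Wasserstein distance between two empirical distributions, and `augmented_map` lifts samples to a hypersurface by concatenating each sample `x` with a nonlinear feature map `f(x)` (here a toy `tanh` stands in for the neural network; the concatenation `[x; f(x)]` keeps the lift injective, since `x` itself is recoverable from the output). Computing the sliced distance on the lifted samples gives the flavor of the ASWD: linear projections in the lifted space act as nonlinear projections of the original samples.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=50, rng=None):
    """Monte Carlo estimate of the sliced Wasserstein-2 distance
    between empirical distributions X and Y (equal sample sizes)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    # Random directions drawn uniformly on the unit sphere.
    theta = rng.normal(size=(n_proj, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project samples onto each direction and sort: in 1D, the
    # Wasserstein distance couples the sorted samples.
    xp = np.sort(X @ theta.T, axis=0)
    yp = np.sort(Y @ theta.T, axis=0)
    return np.sqrt(np.mean((xp - yp) ** 2))

def augmented_map(X, f):
    """Injective lift x -> [x, f(x)]; f plays the role of the
    neural network that parameterizes the hypersurface."""
    return np.concatenate([X, f(X)], axis=1)

# Toy usage: two Gaussian point clouds, lifted with a tanh feature map.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = rng.normal(size=(200, 2)) + 2.0
f = np.tanh  # placeholder for a trained network
d_aswd = sliced_wasserstein(augmented_map(X, f), augmented_map(Y, f), rng=0)
```

In the actual method, `f` is optimized by gradient ascent to make the projections discriminative, subject to a regularizer; this sketch uses a fixed map only to show the mechanics of lifting and slicing.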