On the Learning and Learnability of Quasimetrics

Paper title

On the Learning and Learnability of Quasimetrics

Authors

Tongzhou Wang, Phillip Isola

Abstract

Our world is full of asymmetries. Gravity and wind can make reaching a place easier than coming back. Social artifacts such as genealogy charts and citation graphs are inherently directed. In reinforcement learning and control, optimal goal-reaching strategies are rarely reversible (symmetrical). Distance functions supported on these asymmetrical structures are called quasimetrics. Despite their common appearance, little research has been done on the learning of quasimetrics. Our theoretical analysis reveals that a common class of learning algorithms, including unconstrained multilayer perceptrons (MLPs), provably fails to learn a quasimetric consistent with training data. In contrast, our proposed Poisson Quasimetric Embedding (PQE) is the first quasimetric learning formulation that both is learnable with gradient-based optimization and enjoys strong performance guarantees. Experiments on random graphs, social graphs, and offline Q-learning demonstrate its effectiveness over many common baselines.
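
To make the notion of a quasimetric concrete, here is a minimal sketch that is not taken from the paper: shortest-path lengths on a directed graph are asymmetric yet still obey the triangle inequality. The toy graph, the helper name shortest_path_distances, and the edge weights are all hypothetical choices made for illustration.

```python
import itertools
import math


def shortest_path_distances(n, edges):
    """All-pairs shortest-path lengths on a directed graph (Floyd-Warshall)."""
    d = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


# Hypothetical toy graph: 0 -> 1 -> 2 is cheap, while returning 2 -> 0 is costly.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 5.0)]
d = shortest_path_distances(3, edges)

# A quasimetric may be asymmetric: d(x, y) need not equal d(y, x).
is_symmetric = all(d[i][j] == d[j][i] for i in range(3) for j in range(3))

# It must still satisfy the triangle inequality d(x, z) <= d(x, y) + d(y, z).
satisfies_triangle = all(
    d[i][k] <= d[i][j] + d[j][k]
    for i, j, k in itertools.product(range(3), repeat=3)
)

print(d[0][1], d[1][0], is_symmetric, satisfies_triangle)
```

In this toy graph the distance from node 0 to node 1 is 1.0, while the return trip must go through node 2 and costs 6.0. A learned model such as the one the abstract describes would need to preserve exactly this combination of asymmetry and triangle inequality.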
