Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model

Paper Title

Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model

Authors

Julian Lienen, Eyke Hüllermeier, Ralph Ewerth, Nils Nommensen

Abstract

In many real-world applications, the relative depth of objects in an image is crucial for scene understanding. Recent approaches mainly tackle the problem of depth prediction in monocular images by treating the problem as a regression task. Yet, being interested in an order relation in the first place, ranking methods suggest themselves as a natural alternative to regression, and indeed, ranking approaches leveraging pairwise comparisons as training information ("object A is closer to the camera than B") have shown promising performance on this problem. In this paper, we elaborate on the use of so-called listwise ranking as a generalization of the pairwise approach. Our method is based on the Plackett-Luce (PL) model, a probability distribution on rankings, which we combine with a state-of-the-art neural network architecture and a simple sampling strategy to reduce training complexity. Moreover, taking advantage of the representation of PL as a random utility model, the proposed predictor offers a natural way to recover (shift-invariant) metric depth information from ranking-only data provided at training time. An empirical evaluation on several benchmark datasets in a "zero-shot" setting demonstrates the effectiveness of our approach compared to existing ranking and regression methods.
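
To make the central object of the abstract concrete, the following is a minimal sketch of the negative log-likelihood of a single ranking under a Plackett-Luce model. It is not the authors' implementation: the function name `pl_neg_log_likelihood`, the parameterization of the PL skills as `exp(score)`, and the convention that `ranking` lists item indices from first-ranked to last are assumptions made for illustration. The neural network that would produce the scores and the sampling strategy mentioned in the abstract are not covered.

```python
import math


def pl_neg_log_likelihood(scores, ranking):
    """Negative log-likelihood of `ranking` under a Plackett-Luce model.

    scores  : list of real-valued scores, one per item (assumed to be
              log-skills, i.e. the PL skill of item i is exp(scores[i])).
    ranking : list of item indices, ordered from first-ranked to last
              (hypothetical convention chosen for this sketch).
    """
    # Reorder the scores so that position k holds the score of the
    # item placed at rank k.
    ordered = [scores[i] for i in ranking]

    nll = 0.0
    for k in range(len(ordered)):
        # Items still "available" when the k-th position is filled.
        tail = ordered[k:]
        # Numerically stable log-sum-exp over the remaining items.
        m = max(tail)
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        # -log( exp(s_k) / sum_{j >= k} exp(s_j) )
        nll += log_norm - ordered[k]
    return nll


if __name__ == "__main__":
    # Three sampled pixels with illustrative scores, and one ranking of them.
    example_scores = [0.3, -1.2, 2.0]
    example_ranking = [2, 0, 1]
    print(pl_neg_log_likelihood(example_scores, example_ranking))
```

Under this parameterization, adding the same constant to every score leaves the likelihood unchanged, which is in line with the abstract's remark that metric depth can only be recovered up to a shift. With lists of length two, the expression reduces to a pairwise comparison likelihood, which illustrates the sense in which listwise ranking generalizes the pairwise approach.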
