Paper title
On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: a Survey
Paper authors
Paper abstract
Stereo matching is one of the longest-standing problems in computer vision, with close to 40 years of studies and research. Throughout the years, the paradigm has shifted from local, pixel-level decisions to various forms of discrete and continuous optimization to data-driven, learning-based methods. Recently, the rise of machine learning and the rapid proliferation of deep learning enhanced stereo matching with new exciting trends and applications unthinkable until a few years ago. Interestingly, the relationship between these two worlds is two-way. While machine, and especially deep, learning advanced the state of the art in stereo matching, stereo itself enabled new ground-breaking methodologies such as self-supervised monocular depth estimation based on deep networks. In this paper, we review recent research in the field of learning-based depth estimation from single and binocular images, highlighting the synergies, the successes achieved so far and the open challenges the community is going to face in the immediate future.