DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers

Paper title

DeepFormableTag: End-to-end Generation and Recognition of Deformable Fiducial Markers

Authors

Yaldiz, Mustafa B., Meuleman, Andreas, Jang, Hyeonjoong, Ha, Hyunho, Kim, Min H.

Abstract

Fiducial markers have been broadly used to identify objects or embed messages that can be detected by a camera. Primarily, existing detection methods assume that markers are printed on ideally planar surfaces. Markers often fail to be recognized due to various imaging artifacts of optical/perspective distortion and motion blur. To overcome these limitations, we propose a novel deformable fiducial marker system that consists of three main parts: First, a fiducial marker generator creates a set of free-form color patterns to encode significantly large-scale information in unique visual codes. Second, a differentiable image simulator creates a training dataset of photorealistic scene images with the deformed markers, being rendered during optimization in a differentiable manner. The rendered images include realistic shading with specular reflection, optical distortion, defocus and motion blur, color alteration, imaging noise, and shape deformation of markers. Lastly, a trained marker detector seeks the regions of interest and recognizes multiple marker patterns simultaneously via inverse deformation transformation. The deformable marker creator and detector networks are jointly optimized via the differentiable photorealistic renderer in an end-to-end manner, allowing us to robustly recognize a wide range of deformable markers with high accuracy. Our deformable marker system is capable of decoding 36-bit messages successfully at ~29 fps with severe shape deformation. Results validate that our system significantly outperforms the traditional and data-driven marker methods. Our learning-based marker system opens up new interesting applications of fiducial markers, including cost-effective motion capture of the human body, active 3D scanning using our fiducial markers' array as structured light patterns, and robust augmented reality rendering of virtual objects on dynamic surfaces.
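To make the 36-bit and ~29 fps figures concrete, the following is a minimal sketch of how an integer marker ID could be unpacked into a 36-bit target vector and recovered from per-bit scores. It is an illustration only, not the authors' implementation: the function names `id_to_bits` and `bits_to_id`, the 0.5 threshold, and the assumption that a detector outputs one score in [0, 1] per bit are all hypothetical.

```python
import numpy as np

NUM_BITS = 36  # message length stated in the abstract


def id_to_bits(marker_id: int, num_bits: int = NUM_BITS) -> np.ndarray:
    """Unpack an integer ID into a vector of 0/1 values, most significant bit first."""
    if not 0 <= marker_id < 2 ** num_bits:
        raise ValueError("marker_id does not fit in num_bits")
    bits = [(marker_id >> i) & 1 for i in reversed(range(num_bits))]
    return np.array(bits, dtype=np.float32)


def bits_to_id(scores: np.ndarray, threshold: float = 0.5) -> int:
    """Threshold per-bit scores (assumed detector output in [0, 1]) and repack them."""
    marker_id = 0
    for score in scores:
        marker_id = (marker_id << 1) | int(score > threshold)
    return marker_id


capacity = 2 ** NUM_BITS         # number of distinct 36-bit messages
frame_budget_ms = 1000.0 / 29.0  # time available per frame at roughly 29 fps
```

Under these assumptions, a 36-bit message distinguishes 2^36 marker IDs, and a decoder running at about 29 fps has roughly 34 ms per frame to locate and decode all visible markers.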
