Object Detection Using Sim2Real Domain Randomization for Robotic Applications

Paper Title

Object Detection Using Sim2Real Domain Randomization for Robotic Applications

Authors

Dániel Horváth, Gábor Erdős, Zoltán Istenes, Tomáš Horváth, Sándor Földi

Abstract

Robots working in unstructured environments must be capable of sensing and interpreting their surroundings. One of the main obstacles for deep-learning-based models in the field of robotics is the lack of domain-specific labeled data for different industrial applications. In this article, we propose a sim2real transfer learning method based on domain randomization for object detection, with which labeled synthetic datasets of arbitrary size and object types can be automatically generated. Subsequently, a state-of-the-art convolutional neural network, YOLOv4, is trained to detect the different types of industrial objects. With the proposed domain randomization method, we could shrink the reality gap to a satisfactory level, achieving mAP50 scores of 86.32% and 97.38% for zero-shot and one-shot transfer, respectively, on our manually annotated dataset containing 190 real images. Our solution is suitable for industrial use, as the data generation process takes less than 0.5 s per image and the training lasts only around 12 h on a GeForce RTX 2080 Ti GPU. Furthermore, it can reliably differentiate similar classes of objects with access to only one real image for training. To the best of our knowledge, this is the only work thus far satisfying these constraints.
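
The mAP50 metric quoted in the abstract counts a detection as correct when its bounding box overlaps a ground-truth box of the same class with an intersection over union (IoU) of at least 0.5. The following is a minimal sketch of that matching criterion only, not the authors' evaluation code. The corner-based box format and the names `iou` and `is_true_positive` are assumptions made for illustration, and class matching and the averaging over precision-recall curves are left out.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Hypothetical helper: the IoU >= 0.5 test that the '50' in mAP50 refers to."""
    return iou(pred, truth) >= threshold
```

For example, two 10 x 10 boxes shifted by half their width share an intersection of 50 and a union of 150, so their IoU is 1/3 and the prediction would not count as a match at the 0.5 threshold.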
