Paper Title
Semi-supervised Grasp Detection by Representation Learning in a Vector Quantized Latent Space
Paper Authors
Paper Abstract
For a robot to perform complex manipulation tasks, it must have a good grasping ability. However, vision-based robotic grasp detection is hindered by the lack of sufficient labelled data. Furthermore, the application of semi-supervised learning techniques to grasp detection remains under-explored. In this paper, a semi-supervised grasp detection approach is presented, which models a discrete latent space using a Vector Quantized Variational AutoEncoder (VQ-VAE). To the best of our knowledge, this is the first application of a Variational AutoEncoder (VAE) in the domain of robotic grasp detection. By also utilizing unlabelled data, the VAE helps the model generalize beyond the Cornell Grasping Dataset (CGD) despite the limited amount of labelled data. This claim is validated by testing the model on images that are not available in the CGD. In addition, we augment the Generative Grasping Convolutional Neural Network (GGCNN) architecture with the decoder structure used in the VQ-VAE model, with the intuition that it should help the network regress in the vector-quantized latent space. The resulting model performs significantly better than existing approaches that do not make use of unlabelled images to improve grasp detection.
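To make the core idea of the vector-quantized latent space concrete, the sketch below shows the standard VQ-VAE quantization step: each continuous encoder output is snapped to its nearest entry in a learned codebook, yielding a discrete latent representation. This is a minimal, generic illustration of the VQ-VAE mechanism, not the paper's implementation; the function name, codebook size, and dimensions are hypothetical.

```python
import numpy as np

def vector_quantize(z_e, codebook):
    """Map each encoder output vector to its nearest codebook entry.

    z_e:      (N, D) array of continuous encoder outputs
    codebook: (K, D) array of K learned embedding vectors
    Returns the quantized vectors (N, D) and the chosen indices (N,).
    """
    # Squared Euclidean distance between every latent and every code
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)   # index of the nearest code per latent
    z_q = codebook[indices]          # discrete (quantized) representation
    return z_q, indices

# Toy usage with a hypothetical 4-entry, 2-dimensional codebook
rng = np.random.default_rng(0)
codebook = rng.normal(size=(4, 2))
z_e = rng.normal(size=(3, 2))
z_q, idx = vector_quantize(z_e, codebook)
```

In training, the straight-through estimator copies gradients from `z_q` back to `z_e`, while codebook and commitment losses keep the codebook entries close to the encoder outputs; the lookup itself is exactly the nearest-neighbour search above.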