Paper Title
InsertionNet 2.0: Minimal Contact Multi-Step Insertion Using Multimodal Multiview Sensory Input
Paper Authors
Paper Abstract
We address the problem of devising the means for a robot to rapidly and safely learn insertion skills with just a few human interventions and without hand-crafted rewards or demonstrations. Our InsertionNet version 2.0 provides an improved technique to robustly cope with a wide range of use-cases featuring different shapes, colors, initial poses, etc. In particular, we present a regression-based method based on multimodal input from stereo perception and force, augmented with contrastive learning for the efficient learning of valuable features. In addition, we introduce a one-shot learning technique for insertion, which relies on a relation network scheme to better exploit the collected data and to support multi-step insertion tasks. Our method improves on the results obtained with the original InsertionNet, achieving an almost perfect score (above 97.5$\%$ on 200 trials) in 16 real-life insertion tasks while minimizing the execution time and contact during insertion. We further demonstrate our method's ability to tackle a real-life 3-step insertion task and perfectly solve an unseen insertion task without learning.
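To make the described architecture concrete, below is a minimal, illustrative sketch (not the authors' released code) of a multimodal regression network in the spirit of InsertionNet 2.0: stereo images and a force/torque reading are each encoded, fused, and regressed to a corrective delta-pose, with an auxiliary InfoNCE-style contrastive loss on the image embeddings. All layer sizes, the 64x64 input resolution, and the names `MultimodalInsertionNet` and `contrastive_loss` are assumptions made for illustration.

```python
# Hypothetical sketch of a stereo + force regression network with a
# contrastive auxiliary loss; sizes and structure are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultimodalInsertionNet(nn.Module):
    def __init__(self, embed_dim=128, action_dim=6):
        super().__init__()
        # Shared CNN encoder applied to each stereo image (3-channel, 64x64 assumed).
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Small MLP for a 6-axis force/torque reading.
        self.force_encoder = nn.Sequential(
            nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, embed_dim),
        )
        # Regression head: fused features -> corrective delta-pose
        # (x, y, z, roll, pitch, yaw).
        self.head = nn.Sequential(
            nn.Linear(3 * embed_dim, 256), nn.ReLU(), nn.Linear(256, action_dim),
        )

    def forward(self, left_img, right_img, force):
        z_left = self.image_encoder(left_img)
        z_right = self.image_encoder(right_img)
        z_force = self.force_encoder(force)
        delta_pose = self.head(torch.cat([z_left, z_right, z_force], dim=-1))
        return delta_pose, z_left, z_right


def contrastive_loss(z_a, z_b, temperature=0.1):
    # InfoNCE-style objective: embeddings of two views of the same scene
    # should match each other and differ from the other samples in the batch.
    z_a, z_b = F.normalize(z_a, dim=-1), F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0))
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    net = MultimodalInsertionNet()
    left = torch.randn(8, 3, 64, 64)
    right = torch.randn(8, 3, 64, 64)
    force = torch.randn(8, 6)
    pred, z_l, z_r = net(left, right, force)
    target = torch.randn(8, 6)  # placeholder corrective poses from collected data
    loss = F.mse_loss(pred, target) + 0.1 * contrastive_loss(z_l, z_r)
    loss.backward()
    print(pred.shape, loss.item())
```

The one-shot, multi-step extension described in the abstract (the relation-network scheme) would sit on top of such encoders by comparing the current observation's embedding against stored reference embeddings for each insertion step; it is not shown here.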