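Development and evaluation of automated localisation and reconstruction of all fruits on tomato plants in a greenhouse based on multi-view perception and 3D multi-object tracking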

Paper title

Development and evaluation of automated localisation and reconstruction of all fruits on tomato plants in a greenhouse based on multi-view perception and 3D multi-object tracking

Authors

Rincon, David Rapado; van Henten, Eldert J.; Kootstra, Gert

Abstract

The ability to accurately represent and localise relevant objects is essential for robots to carry out tasks effectively. Traditional approaches, where robots simply capture an image, process that image to take an action, and then forget the information, have proven to struggle in the presence of occlusions. Methods using multi-view perception, which have the potential to address some of these problems, require a world model that guides the collection, integration and extraction of information from multiple viewpoints. Furthermore, constructing a generic representation that can be applied in various environments and tasks is a difficult challenge. In this paper, a novel approach for building generic representations in occluded agro-food environments using multi-view perception and 3D multi-object tracking is introduced. The method is based on a detection algorithm that generates partial point clouds for each detected object, followed by a 3D multi-object tracking algorithm that updates the representation over time. The accuracy of the representation was evaluated in a real-world environment, where successful representation and localisation of tomatoes in tomato plants were achieved, despite high levels of occlusion, with the total count of tomatoes estimated with a maximum error of 5.08% and the tomatoes tracked with an accuracy up to 71.47%. Novel tracking metrics were introduced, demonstrating that valuable insight into the errors in localising and representing the fruits can be provided by their use. This approach presents a novel solution for building representations in occluded agro-food environments, demonstrating potential to enable robots to perform tasks effectively in these challenging environments.
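The detect-then-track pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each detection is a small list of 3D points, reduces every partial point cloud to its centroid, and swaps in a simple greedy nearest-centroid association with a distance gate. The names `Tracker`, `centroid` and `max_dist`, and the 0.05 m gate, are hypothetical choices made for illustration only.

```python
import math


def centroid(points):
    """Mean of a list of (x, y, z) points: a stand-in for a partial point cloud."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


class Tracker:
    """Toy 3D multi-object tracker using greedy nearest-centroid association."""

    def __init__(self, max_dist=0.05):
        self.max_dist = max_dist  # hypothetical gating distance, in metres
        self.tracks = {}          # track id -> (centroid estimate, observation count)
        self._next_id = 0

    def update(self, detections):
        """Merge one viewpoint's detections into the tracks; return the track count."""
        centroids = [centroid(d) for d in detections]
        # Candidate (distance, track id, detection index) pairs inside the gate,
        # sorted so that the closest pairs are matched first.
        pairs = sorted(
            (math.dist(c, est), tid, j)
            for tid, (est, _) in self.tracks.items()
            for j, c in enumerate(centroids)
            if math.dist(c, est) <= self.max_dist
        )
        used_tracks, used_dets = set(), set()
        for _, tid, j in pairs:
            if tid in used_tracks or j in used_dets:
                continue
            est, n = self.tracks[tid]
            # Running mean of the centroid over all observations of this track.
            new_est = tuple((e * n + c) / (n + 1) for e, c in zip(est, centroids[j]))
            self.tracks[tid] = (new_est, n + 1)
            used_tracks.add(tid)
            used_dets.add(j)
        # Detections left unmatched start new tracks.
        for j, c in enumerate(centroids):
            if j not in used_dets:
                self.tracks[self._next_id] = (c, 1)
                self._next_id += 1
        return len(self.tracks)


tracker = Tracker(max_dist=0.05)
# Viewpoint 1: two fruits, the first seen as a two-point partial cloud.
tracker.update([[(0.0, 0.0, 0.0), (0.02, 0.0, 0.0)], [(0.5, 0.0, 0.0)]])
# Viewpoint 2: the first fruit is seen again, plus one previously occluded fruit.
print(tracker.update([[(0.02, 0.0, 0.0)], [(1.0, 0.0, 0.0)]]))
```

In this toy run the number of tracks after the last viewpoint plays the role of the fruit count estimate. A fuller tracker would typically use an optimal assignment step and a motion or uncertainty model in place of the greedy matching and running mean used here.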
