A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing

Paper Title

A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing

Authors

Richa Verma, Aniruddha Singhal, Harshad Khadilkar, Ansuma Basumatary, Siddharth Nayak, Harsh Vardhan Singh, Swagat Kumar, Rajesh Sinha

Abstract

We propose a Deep Reinforcement Learning (Deep RL) algorithm for solving the online 3D bin packing problem for an arbitrary number of bins and any bin size. The focus is on producing decisions that can be physically implemented by a robotic loading arm, a laboratory prototype used for testing the concept. The problem considered in this paper is novel in two ways. First, unlike the traditional 3D bin packing problem, we assume that the entire set of objects to be packed is not known a priori. Instead, a fixed number of upcoming objects is visible to the loading system, and they must be loaded in the order of arrival. Second, the goal is not to move objects from one point to another via a feasible path, but to find a location and orientation for each object that maximises the overall packing efficiency of the bin(s). Finally, the learnt model is designed to work with problem instances of arbitrary size without retraining. Simulation results show that the RL-based method outperforms state-of-the-art online bin packing heuristics in terms of empirical competitive ratio and volume efficiency.
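
The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not define them, so the definitions here are assumptions: volume efficiency is taken as the packed item volume divided by the total volume of the bins opened, and the empirical competitive ratio is taken as the number of bins used divided by a volume-based lower bound on the optimal bin count. The function names and the `Box` type are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple

# Hypothetical representation: a box is (length, width, height).
Box = Tuple[float, float, float]


def volume(box: Box) -> float:
    """Volume of a rectangular box."""
    length, width, height = box
    return length * width * height


def volume_efficiency(items: Sequence[Box], bin_dims: Box, bins_used: int) -> float:
    """Assumed definition: packed item volume / total volume of opened bins."""
    if bins_used <= 0:
        raise ValueError("bins_used must be positive")
    packed = sum(volume(item) for item in items)
    return packed / (bins_used * volume(bin_dims))


def empirical_competitive_ratio(items: Sequence[Box], bin_dims: Box, bins_used: int) -> float:
    """Assumed definition: bins used / volume-based lower bound on the optimum.

    The true optimal bin count is generally unknown, so the lower bound
    ceil(total item volume / bin volume) stands in for it here.
    """
    if bins_used <= 0:
        raise ValueError("bins_used must be positive")
    total = sum(volume(item) for item in items)
    lower_bound = max(1, math.ceil(total / volume(bin_dims)))
    return bins_used / lower_bound


if __name__ == "__main__":
    items = [(2, 2, 2)] * 10   # ten 2x2x2 boxes, total volume 80
    bin_dims = (4, 4, 4)       # bin volume 64
    print(volume_efficiency(items, bin_dims, bins_used=2))
    print(empirical_competitive_ratio(items, bin_dims, bins_used=3))
```

Under these assumed definitions, a ratio closer to 1 and an efficiency closer to 1 both indicate a tighter packing. Because the lower bound ignores geometric constraints, a ratio of exactly 1 may not be achievable for a given instance.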
