Efficient Computation of Map-scale Continuous Mutual Information on Chip in Real Time

Paper Title

Efficient Computation of Map-scale Continuous Mutual Information on Chip in Real Time

Authors

Keshav Gupta, Peter Zhi Xuan Li, Sertac Karaman, Vivienne Sze

Abstract

Exploration tasks are essential to many emerging robotics applications, ranging from search and rescue to space exploration. The planning problem for exploration requires determining the best locations for future measurements that will enhance the fidelity of the map, for example, by reducing its total entropy. A widely studied technique involves computing the Mutual Information (MI) between the current map and future measurements, and utilizing this MI metric to decide the locations for future measurements. However, computing MI for reasonably sized maps is slow and power hungry, which has been a bottleneck for fast and efficient robotic exploration. In this paper, we introduce a new hardware accelerator architecture for MI computation that features a low-latency, energy-efficient MI compute core and an optimized memory subsystem that provides sufficient bandwidth to keep the cores fully utilized. The core employs interleaving to counter the recursive algorithm, and workload balancing and numerical approximations to reduce latency and energy consumption. We demonstrate this optimized architecture with a Field-Programmable Gate Array (FPGA) implementation, which can compute MI for all cells in an entire 201-by-201 occupancy grid (e.g., representing a 20.1 m by 20.1 m map at 0.1 m resolution) in 1.55 ms while consuming 1.7 mJ of energy, thus finally rendering MI computation for the whole map real time and at a fraction of the energy cost of traditional compute platforms. For comparison, this particular FPGA implementation running on the Xilinx Zynq-7000 platform is two orders of magnitude faster and consumes three orders of magnitude less energy per MI map compute, when compared to a baseline GPU implementation running on an NVIDIA GeForce GTX 980 platform. The improvements are more pronounced when compared to CPU implementations of equivalent algorithms.
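
To put the quoted figures in perspective, the short Python sketch below derives per-cell throughput, average power, and energy per cell from the numbers in the abstract, and computes the total entropy of an occupancy grid. It is illustrative only: the derived quantities are plain arithmetic and are not reported in the abstract, the average power assumes the energy is spent uniformly over the stated latency, and the entropy model (independent binary cells, with hypothetical helper names `cell_entropy` and `map_entropy`) is a common textbook assumption rather than the authors' implementation.

```python
import math

def cell_entropy(p: float) -> float:
    """Shannon entropy in bits of one cell with occupancy probability p.

    Assumes each cell is an independent binary variable (an assumption
    made for this sketch, not a detail taken from the abstract).
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully known cell carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def map_entropy(grid) -> float:
    """Total entropy of the map: the sum of the per-cell entropies."""
    return sum(cell_entropy(p) for row in grid for p in row)

# Figures quoted in the abstract.
SIDE = 201            # cells per side of the occupancy grid
LATENCY_S = 1.55e-3   # time to compute MI for the whole map
ENERGY_J = 1.7e-3     # energy consumed for the whole map

# Derived quantities (simple arithmetic, not reported in the abstract).
cells = SIDE * SIDE
cells_per_second = cells / LATENCY_S
average_power_w = ENERGY_J / LATENCY_S   # assumes uniform power draw
energy_per_cell_nj = ENERGY_J / cells * 1e9

# A completely unexplored map: every cell has occupancy probability 0.5.
unknown_map = [[0.5] * SIDE for _ in range(SIDE)]

print(f"cells in map:       {cells}")
print(f"cells per second:   {cells_per_second:.3e}")
print(f"average power (W):  {average_power_w:.2f}")
print(f"energy/cell (nJ):   {energy_per_cell_nj:.1f}")
print(f"entropy of unexplored map (bits): {map_entropy(unknown_map):.0f}")
```

An unexplored cell contributes one bit, so the unexplored 201-by-201 map starts at one bit per cell; exploration aims to drive that total down by choosing measurement locations with high MI.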
