Paper Title
SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater Robots
Paper Authors
Paper Abstract
This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots. Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images. The SVAM-Net architecture is configured in a unique way to jointly accommodate bottom-up and top-down learning within two separate branches of the network while sharing the same encoding layers. We design dedicated spatial attention modules (SAMs) along these learning pathways to exploit the coarse-level and fine-level semantic features for SOD at four stages of abstraction. The bottom-up branch performs a rough yet reasonably accurate saliency estimation at a fast rate, whereas the deeper top-down branch incorporates a residual refinement module (RRM) that provides fine-grained localization of the salient objects. Extensive performance evaluation of SVAM-Net on benchmark datasets clearly demonstrates its effectiveness for underwater SOD. We also validate its generalization performance on data from several ocean trials, which include test images of diverse underwater scenes and waterbodies, as well as images with unseen natural objects. Moreover, we analyze its computational feasibility for robotic deployments and demonstrate its utility in several important use cases of visual attention modeling.
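The abstract describes a shared encoder feeding two branches: a fast bottom-up head that emits a coarse saliency map, and a deeper top-down decoder whose output passes through a residual refinement module (RRM), with spatial attention modules (SAMs) along the encoding pathway. The following PyTorch snippet is a minimal sketch of that wiring only; the layer sizes, the SAM design (a single-channel sigmoid gate here), and the one-convolution RRM are illustrative assumptions, not the actual SVAM-Net architecture from the paper.

```python
import torch
import torch.nn as nn


class SAM(nn.Module):
    """Hypothetical spatial attention module: gates features with a sigmoid map."""

    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Conv2d(ch, 1, kernel_size=7, padding=3)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(x))


class SVAMNetSketch(nn.Module):
    """Toy two-branch layout: shared encoder, coarse head, refined decoder."""

    def __init__(self):
        super().__init__()
        # Shared encoder: two downsampling stages (stand-in for the real backbone).
        self.enc1 = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU())
        self.sam1, self.sam2 = SAM(8), SAM(16)
        # Bottom-up branch: cheap 1x1 head for a fast, low-resolution estimate.
        self.coarse_head = nn.Conv2d(16, 1, 1)
        # Top-down branch: upsampling decoder plus a residual refinement module.
        self.dec = nn.ConvTranspose2d(16, 8, 2, stride=2)
        self.fine_head = nn.ConvTranspose2d(8, 1, 2, stride=2)
        self.rrm = nn.Conv2d(1, 1, 3, padding=1)

    def forward(self, x):
        f1 = self.sam1(self.enc1(x))
        f2 = self.sam2(self.enc2(f1))
        coarse = torch.sigmoid(self.coarse_head(f2))  # fast, quarter resolution
        fine = self.fine_head(torch.relu(self.dec(f2)))
        fine = torch.sigmoid(fine + self.rrm(fine))   # residual refinement step
        return coarse, fine


net = SVAMNetSketch().eval()
with torch.no_grad():
    coarse, fine = net(torch.rand(1, 3, 64, 64))
print(coarse.shape, fine.shape)  # coarse at 1/4 input resolution, fine at full
```

The point of the sketch is the shape of the computation: both branches reuse the same attended encoder features, so the robot can read the cheap coarse map every frame and consult the refined full-resolution map only when fine localization is needed.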