Paper Title

ATSal: An Attention Based Architecture for Saliency Prediction in 360 Videos

Authors

Yasser Dahou, Marouane Tliba, Kevin McGuinness, Noel O'Connor

Abstract

The spherical domain representation of 360° video/images presents many challenges related to the storage, processing, transmission, and rendering of omnidirectional videos (ODV). Models of human visual attention can be used so that only a single viewport is rendered at a time, which is important when developing systems that allow users to explore ODV with head-mounted displays (HMD). Accordingly, researchers have proposed various saliency models for 360° video/images. This paper proposes ATSal, a novel attention-based (head-eye) saliency model for 360° videos. The attention mechanism explicitly encodes global static visual attention, allowing expert models to focus on learning the saliency on local patches throughout consecutive frames. We compare the proposed approach to other state-of-the-art saliency models on two datasets: Salient360! and VR-EyeTracking. Experimental results on over 80 ODV videos (75K+ frames) show that the proposed method outperforms the existing state-of-the-art.
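
The abstract describes a two-stream design: a global attention stream that encodes static visual attention over the whole frame, and expert models that learn saliency on local patches of consecutive frames. Below is a minimal PyTorch sketch of that idea, for illustration only: the module names (`GlobalAttention`, `LocalExpert`, `ATSalSketch`), all layer sizes, the two-frame input, and the multiplicative fusion are assumptions, since the abstract does not specify the actual backbones, patch decomposition, or fusion scheme.

```python
# Minimal sketch of a two-stream (global attention + local expert) saliency
# model, loosely following the abstract's description of ATSal. Every layer
# size and the fusion-by-gating step are illustrative assumptions, not the
# authors' actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalAttention(nn.Module):
    """Encodes global static visual attention over the full equirectangular frame."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1),  # single-channel attention logits
        )

    def forward(self, frame):                  # frame: (B, 3, H, W)
        att = torch.sigmoid(self.encoder(frame))   # (B, 1, H/4, W/4)
        # Upsample back to frame resolution so it can modulate local saliency.
        return F.interpolate(att, size=frame.shape[-2:],
                             mode="bilinear", align_corners=False)


class LocalExpert(nn.Module):
    """Predicts saliency from stacked consecutive frames (local/temporal stream)."""
    def __init__(self, in_frames=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * in_frames, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, clip):                   # clip: (B, 3*in_frames, H, W)
        return torch.sigmoid(self.net(clip))


class ATSalSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.attention = GlobalAttention()
        self.expert = LocalExpert()

    def forward(self, frame, clip):
        # Assumed fusion: the global attention map gates the expert's output,
        # so static global attention steers where local saliency is kept.
        return self.attention(frame) * self.expert(clip)


if __name__ == "__main__":
    model = ATSalSketch()
    frame = torch.randn(1, 3, 128, 256)        # current equirectangular frame
    clip = torch.randn(1, 6, 128, 256)         # two stacked consecutive frames
    print(model(frame, clip).shape)            # -> torch.Size([1, 1, 128, 256])
```

The multiplicative gating here is one plausible reading of "the attention mechanism explicitly encodes global static visual attention, allowing expert models to focus on local patches"; the published model may combine the streams differently.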
