Paper Title

SalSum: Saliency-based Video Summarization using Generative Adversarial Networks

Paper Authors

George Pantazis, George Dimas, Dimitris K. Iakovidis

Paper Abstract

The huge amount of video data produced daily by camera-based systems, such as surveillance, medical and telecommunication systems, creates the need for effective video summarization (VS) methods. These methods should be capable of creating an overview of the video content. In this paper, we propose a novel VS method based on a Generative Adversarial Network (GAN) model pre-trained with human eye fixations. The main contribution of the proposed method is that it can provide perceptually compatible video summaries by combining both perceived color and spatiotemporal visual attention cues in an unsupervised scheme. Several fusion approaches are considered for robustness under uncertainty and for personalization. The proposed method is evaluated against state-of-the-art VS approaches on the VSUMM benchmark dataset. The experimental results show that SalSum outperforms the state-of-the-art approaches, achieving the highest F-measure score on the VSUMM benchmark.
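The abstract describes fusing a perceived-color cue with a spatiotemporal visual-attention (saliency) cue in an unsupervised scheme to select keyframes. The sketch below is a minimal, hypothetical illustration of such a fusion and is not the authors' SalSum pipeline: the function names, the HSV-histogram color cue, the fusion weight `w`, and the selection threshold are all assumptions made here for illustration; in SalSum the per-frame saliency scores would come from the GAN model pre-trained on human eye fixations.

```python
# Hypothetical sketch only (not the SalSum implementation): fuse a per-frame
# saliency score with a perceived-color change score by a weighted average,
# then keep frames whose fused score exceeds a threshold as keyframes.
import numpy as np


def color_change_scores(frames_hsv):
    """Per-frame color-change score from HSV histogram differences.

    frames_hsv: sequence of (H, W, 3) uint8 HSV frames.
    Returns an array of scores in [0, 1], one per frame (first frame gets 0).
    """
    hists = []
    for f in frames_hsv:
        h, _ = np.histogramdd(
            f.reshape(-1, 3), bins=(16, 4, 4), range=((0, 256),) * 3
        )
        hists.append(h.ravel() / h.sum())
    hists = np.array(hists)
    scores = np.zeros(len(hists))
    # Half the L1 distance between consecutive normalized histograms lies in [0, 1].
    scores[1:] = 0.5 * np.abs(hists[1:] - hists[:-1]).sum(axis=1)
    return scores


def fuse_and_select(saliency_scores, color_scores, w=0.5, threshold=0.3):
    """Weighted-average fusion of the two cues and thresholded keyframe selection.

    saliency_scores: per-frame visual-attention scores (e.g., mean value of a
    predicted saliency map), assumed already normalized to [0, 1].
    """
    saliency_scores = np.asarray(saliency_scores, dtype=float)
    color_scores = np.asarray(color_scores, dtype=float)
    fused = w * saliency_scores + (1.0 - w) * color_scores
    keyframe_idx = np.flatnonzero(fused >= threshold)
    return fused, keyframe_idx
```

Other fusion rules (e.g., the maximum or product of the two cues) could be swapped in where the weighted average is computed; varying the rule or the weight `w` is one plausible way to realize the "several fusion approaches" and the personalization mentioned in the abstract.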
