Paper Title

Neighbourhood Representative Sampling for Efficient End-to-end Video Quality Assessment

Authors

Haoning Wu, Chaofeng Chen, Liang Liao, Jingwen Hou, Wenxiu Sun, Qiong Yan, Jinwei Gu, Weisi Lin

Abstract

The increased resolution of real-world videos presents a dilemma between efficiency and accuracy for deep Video Quality Assessment (VQA). On the one hand, keeping the original resolution will lead to unacceptable computational costs. On the other hand, existing practices, such as resizing and cropping, will change the quality of original videos due to the loss of details and contents, and are therefore harmful to quality assessment. With the obtained insight from the study of spatial-temporal redundancy in the human visual system and visual coding theory, we observe that quality information around a neighbourhood is typically similar, motivating us to investigate an effective quality-sensitive neighbourhood representatives scheme for VQA. In this work, we propose a unified scheme, spatial-temporal grid mini-cube sampling (St-GMS) to get a novel type of sample, named fragments. Full-resolution videos are first divided into mini-cubes with preset spatial-temporal grids, then the temporal-aligned quality representatives are sampled to compose the fragments that serve as inputs for VQA. In addition, we design the Fragment Attention Network (FANet), a network architecture tailored specifically for fragments. With fragments and FANet, the proposed efficient end-to-end FAST-VQA and FasterVQA achieve significantly better performance than existing approaches on all VQA benchmarks while requiring only 1/1612 FLOPs compared to the current state-of-the-art. Codes, models and demos are available at https://github.com/timothyhtimothy/FAST-VQA-and-FasterVQA.
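As a rough illustration of the St-GMS idea described above, the sketch below divides each sampled frame into a spatial grid, takes one small patch per grid cell at a location shared across the sampled frames (so the mini-cubes stay temporally aligned), and splices the patches into a compact "fragment" input. The function name, parameter names, and default values (sample_fragments, grid, patch, num_frames) are illustrative assumptions, not the authors' exact implementation; see the linked repository for the official code.

```python
import torch

def sample_fragments(video, grid=7, patch=32, num_frames=8):
    """Illustrative sketch of spatial-temporal grid mini-cube sampling (St-GMS).

    video: float tensor of shape (T, C, H, W), full resolution
           (assumed H >= grid * patch and W >= grid * patch).
    Returns a fragment tensor of shape (num_frames, C, grid*patch, grid*patch).
    """
    T, C, H, W = video.shape

    # Temporal sampling: pick num_frames roughly uniformly spaced frames.
    t_idx = torch.linspace(0, T - 1, num_frames).long()
    clip = video[t_idx]  # (num_frames, C, H, W)

    gh, gw = H // grid, W // grid  # size of each spatial grid cell
    rows = []
    for i in range(grid):
        cols = []
        for j in range(grid):
            # One random patch location per grid cell, shared across all
            # sampled frames so each mini-cube stays temporally aligned.
            y = i * gh + torch.randint(0, max(gh - patch, 1), (1,)).item()
            x = j * gw + torch.randint(0, max(gw - patch, 1), (1,)).item()
            cols.append(clip[:, :, y:y + patch, x:x + patch])
        rows.append(torch.cat(cols, dim=-1))   # stitch patches along width
    fragment = torch.cat(rows, dim=-2)          # stitch rows along height
    return fragment
```

With the illustrative defaults (a 7x7 grid of 32x32 patches), a full-resolution clip is reduced to a 224x224 fragment per sampled frame, which is what makes end-to-end training and inference tractable while preserving local quality-sensitive detail.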
