Paper Title
MuZero with Self-competition for Rate Control in VP9 Video Compression
Authors
Abstract
Video streaming usage has seen a significant rise as entertainment, education, and business increasingly rely on online video. Optimizing video compression has the potential to increase access and quality of content to users, and reduce energy use and costs overall. In this paper, we present an application of the MuZero algorithm to the challenge of video compression. Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services. We treat this as a sequential decision making problem to maximize the video quality with an episodic constraint imposed by the target bitrate. Notably, we introduce a novel self-competition based reward mechanism to solve constrained RL with variable constraint satisfaction difficulty, which is challenging for existing constrained RL methods. We demonstrate that the MuZero-based rate control achieves an average 6.28% reduction in size of the compressed videos for the same delivered video quality level (measured as PSNR BD-rate) compared to libvpx's two-pass VBR rate control policy, while having better constraint satisfaction behavior.
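The abstract does not spell out how the self-competition reward is computed. The following is a minimal sketch of one way such a reward could look, assuming the agent's episode is scored against an exponential moving average of its own past results on the same video, with bitrate overshoot compared first and PSNR used only when the bitrate constraint is met. The class names, the `decay` parameter, and the comparison rule are all hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EpisodeResult:
    psnr: float       # delivered quality of the encoded video, in dB
    overshoot: float  # max(0, actual bitrate - target bitrate), e.g. in kbps


class SelfCompetitionBaseline:
    """Hypothetical running baseline of the agent's own past results."""

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay
        self.psnr: Optional[float] = None
        self.overshoot: Optional[float] = None

    def reward(self, result: EpisodeResult) -> float:
        """Return +1 if the episode beats the baseline, -1 if not, 0 if no baseline yet."""
        if self.psnr is None or self.overshoot is None:
            # First episode: nothing to compete against yet.
            self.psnr, self.overshoot = result.psnr, result.overshoot
            return 0.0

        if result.overshoot > 0 or self.overshoot > 0:
            # Assumed rule: when the constraint is at stake, lower overshoot wins.
            won = result.overshoot < self.overshoot
        else:
            # Both satisfy the bitrate constraint: higher quality wins.
            won = result.psnr > self.psnr

        # Update the baseline with an exponential moving average.
        d = self.decay
        self.psnr = d * self.psnr + (1 - d) * result.psnr
        self.overshoot = d * self.overshoot + (1 - d) * result.overshoot
        return 1.0 if won else -1.0
```

Under these assumptions, a binary win/lose signal keeps the reward on the same scale regardless of how hard the target bitrate is to meet for a given video, which is one plausible reading of "variable constraint satisfaction difficulty".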