Neural Face Video Compression using Multiple Views

Paper title

Neural Face Video Compression using Multiple Views

Authors

Volokitin, Anna, Brugger, Stefan, Benlalah, Ali, Martin, Sebastian, Amberg, Brian, Tschannen, Michael

Abstract

Recent advances in deep generative models led to the development of neural face video compression codecs that use an order of magnitude less bandwidth than engineered codecs. These neural codecs reconstruct the current frame by warping a source frame and using a generative model to compensate for imperfections in the warped source frame. Thereby, the warp is encoded and transmitted using a small number of keypoints rather than a dense flow field, which leads to massive savings compared to traditional codecs. However, by relying on a single source frame only, these methods lead to inaccurate reconstructions (e.g. one side of the head becomes unoccluded when turning the head and has to be synthesized). Here, we aim to tackle this issue by relying on multiple source frames (views of the face) and present encouraging results.
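
As a rough illustration of why transmitting keypoints instead of a dense flow field saves bandwidth, the back-of-the-envelope sketch below compares raw, uncompressed payload sizes per frame. The keypoint count (10), the frame resolution (256x256), and the 16-bit value size are assumptions chosen purely for illustration; none of them come from the paper, and real codecs would entropy-code both representations further.

```python
# Illustrative sketch only: every size below is an assumption,
# not a figure reported in the paper.

def keypoint_payload_bytes(num_keypoints: int,
                           coords_per_keypoint: int = 2,
                           bytes_per_value: int = 2) -> int:
    """Bytes needed to send one frame's keypoints, uncompressed."""
    return num_keypoints * coords_per_keypoint * bytes_per_value


def dense_flow_payload_bytes(height: int, width: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes needed to send a raw dense flow field with (dx, dy) per pixel."""
    return height * width * 2 * bytes_per_value


# Hypothetical setting: 10 two-dimensional keypoints, 256x256 frames, 16-bit values.
kp_bytes = keypoint_payload_bytes(num_keypoints=10)   # 10 * 2 * 2
flow_bytes = dense_flow_payload_bytes(256, 256)       # 256 * 256 * 2 * 2
ratio = flow_bytes / kp_bytes

print(kp_bytes, flow_bytes, ratio)  # prints: 40 262144 6553.6
```

Under these assumed numbers the keypoint payload is several thousand times smaller than the raw flow field. The comparison covers only the per-frame warp description; the source frames themselves (one in the single-view setting, several in the multi-view setting the abstract proposes) would also have to be transmitted.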
