Paper Title
Learning Context-Based Non-local Entropy Modeling for Image Compression
Paper Authors
Paper Abstract
The entropy of the latent codes usually serves as the rate loss in recent learned lossy image compression methods, so precise estimation of the probability distribution of the codes plays a vital role in compression performance. However, existing deep-learning-based entropy modeling methods generally assume that the latent codes are statistically independent, or that they depend only on some side information or local context; this fails to take the global similarity within the context into account and thus hinders accurate entropy estimation. To address this issue, we propose a non-local operation for context modeling that exploits the global similarity within the context. Specifically, we first introduce proxy similarity functions and spatial masks to handle the missing-reference problem in context modeling. We then combine the local and the global context via a non-local attention block and employ it in masked convolutional networks for entropy modeling. The entropy model is further adopted as the rate loss in joint rate-distortion optimization to guide the training of the analysis-transform and synthesis-transform networks in a transform coding framework. Considering that the width of the transforms is essential for training low-distortion models, we finally introduce a U-Net block in the transforms to increase the width with manageable memory consumption and time complexity. Experiments on the Kodak and Tecnick datasets demonstrate the superiority of the proposed context-based non-local attention block in entropy modeling, and of the U-Net block in low-distortion compression, over existing image compression standards and recent deep image compression models.
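To make the context-modeling idea concrete, below is a minimal PyTorch sketch of causal local-plus-global context fusion for entropy modeling. This is not the authors' implementation: `MaskedConv2d`, `CausalNonLocalAttention`, `ContextModel`, and all layer sizes are hypothetical names chosen for illustration, and the paper's proxy similarity functions are approximated here with plain dot-product attention under a raster-scan causal mask.

```python
# Minimal sketch, assuming a PixelCNN-style raster-scan decoding order.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Conv2d):
    """Type-A masked convolution: the current position and all future
    positions (in raster-scan order) are zeroed out of the kernel."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        _, _, kh, kw = self.weight.shape
        mask = torch.ones(kh, kw)
        mask[kh // 2, kw // 2:] = 0   # center pixel and the rest of its row
        mask[kh // 2 + 1:, :] = 0     # all rows below
        self.register_buffer("mask", mask[None, None])

    def forward(self, x):
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding)

class CausalNonLocalAttention(nn.Module):
    """Dot-product attention over the latent map with a causal spatial mask:
    position i may only attend to already-decoded positions j < i. This is
    a stand-in for the paper's proxy-similarity non-local operation."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)   # (b, hw, c)
        k = self.k(x).flatten(2)                   # (b, c, hw)
        v = self.v(x).flatten(2).transpose(1, 2)   # (b, hw, c)
        logits = q @ k / c ** 0.5                  # (b, hw, hw)
        # Spatial mask handling the missing-reference problem: entry (i, j)
        # is kept only when j < i in raster-scan order.
        causal = torch.tril(torch.ones(h * w, h * w, device=x.device), -1)
        logits = logits.masked_fill(causal == 0, float("-inf"))
        attn = torch.softmax(logits, dim=-1)
        attn = torch.nan_to_num(attn)  # first position has no references
        return (attn @ v).transpose(1, 2).reshape(b, c, h, w)

class ContextModel(nn.Module):
    """Fuse local (masked conv) and global (masked attention) context and
    predict entropy-model parameters, e.g. a Gaussian mean and scale."""
    def __init__(self, channels):
        super().__init__()
        self.local = MaskedConv2d(channels, channels, 5, padding=2)
        self.nonlocal_attn = CausalNonLocalAttention(channels)
        self.fuse = nn.Conv2d(2 * channels, 2 * channels, 1)

    def forward(self, y_hat):
        ctx = torch.cat([self.local(y_hat), self.nonlocal_attn(y_hat)], dim=1)
        mean, scale = self.fuse(ctx).chunk(2, dim=1)
        return mean, F.softplus(scale)  # positive scale for the rate loss

if __name__ == "__main__":
    y_hat = torch.randn(1, 64, 8, 8)        # stand-in quantized latents
    mean, scale = ContextModel(64)(y_hat)
    print(mean.shape, scale.shape)          # torch.Size([1, 64, 8, 8]) each
```

Note that both branches respect the same causal constraint: during decoding, only already-reconstructed positions may serve as references, which is exactly the missing-reference problem the abstract describes. The parallel attention shown here is the training-time form; actual decoding would evaluate it position by position.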