Transformations in Learned Image Compression from a Modulation Perspective

Paper Title

Transformations in Learned Image Compression from a Modulation Perspective

Authors

Youneng Bao, Fangyang Meng, Wen Tan, Chao Li, Yonghong Tian, Yongsheng Liang

Abstract

This paper proposes a unified transformation method for learned image compression (LIC) from the perspective of modulation. First, quantization in LIC is treated as a generalized channel with additive uniform noise. LIC is then interpreted as a particular communication system, based on the consistency of the two in structure and optimization objectives. Techniques from communication systems can therefore be applied to guide the design of LIC modules. On this basis, a unified transform method based on signal modulation (TSM) is defined. Viewed through TSM, the existing transformation methods reduce mathematically to linear modulation. A series of transformation methods, e.g. TPM and TJM, are obtained by extending this to nonlinear modulation. Experimental results on various datasets and backbone architectures verify the effectiveness and robustness of the proposed method. More importantly, they further confirm the feasibility of guiding LIC design from a communication perspective. For example, when the backbone architecture is a hyperprior combined with a context model, the method achieves a 3.52% BD-rate reduction over GDN on the Kodak dataset without increasing complexity.
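
The view of quantization as a channel with additive uniform noise can be illustrated with a minimal sketch. It is not the authors' code: the function names, the Gaussian latent values, and the noise range [-0.5, 0.5) are assumptions chosen for illustration. It compares the error introduced by hard rounding with the error introduced by the additive-noise channel.

```python
import numpy as np

def quantize(y):
    """Hard quantization: round each latent value to the nearest integer."""
    return np.round(y)

def noisy_channel(y, rng):
    """Channel view (assumed form): add noise drawn uniformly from [-0.5, 0.5)."""
    return y + rng.uniform(-0.5, 0.5, size=y.shape)

rng = np.random.default_rng(0)
# Hypothetical latent values; a real LIC model would produce these with its encoder.
y = rng.normal(0.0, 4.0, size=200_000)

err_round = quantize(y) - y          # error introduced by rounding
err_noise = noisy_channel(y, rng) - y  # error introduced by the noise channel

# Both errors are bounded by 0.5 in magnitude, and both variances
# should be close to 1/12, the variance of a unit-width uniform distribution.
print("rounding error variance:", err_round.var())
print("uniform noise variance: ", err_noise.var())
```

Under these assumptions the two error distributions have matching bounds and nearly matching variance, which is the sense in which rounding can be modeled as an additive uniform noise channel.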
