Image Coding for Machines with Omnipotent Feature Learning

Paper Title

Image Coding for Machines with Omnipotent Feature Learning

Authors

Ruoyu Feng, Xin Jin, Zongyu Guo, Runsen Feng, Yixin Gao, Tianyu He, Zhizheng Zhang, Simeng Sun, Zhibo Chen

Abstract

Image Coding for Machines (ICM) aims to compress images for AI task analysis rather than for human perception. Learning a feature that is both general (for AI tasks) and compact (for compression) is pivotal to its success. In this paper, we attempt to develop an ICM framework by learning universal features while also considering compression. We name such features omnipotent features and the corresponding framework Omni-ICM. Because self-supervised learning (SSL) improves feature generalization, we integrate it with the compression task in the Omni-ICM framework to learn omnipotent features. However, it is non-trivial to coordinate semantics modeling in SSL with redundancy removal in compression, so we design a novel information filtering (IF) module between them, co-optimizing instance distinguishment and entropy minimization to adaptively drop information that is weakly related to AI tasks (e.g., some texture redundancy). Unlike previous task-specific solutions, Omni-ICM can directly support AI task analysis based on the learned omnipotent features, without joint training or extra transformation. Albeit simple and intuitive, Omni-ICM significantly outperforms existing traditional and learning-based codecs on multiple fundamental vision tasks.
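The idea of co-optimizing instance distinguishment and entropy minimization can be illustrated with a toy objective. The sketch below is not the authors' implementation: it assumes an InfoNCE-style contrastive loss as a stand-in for instance distinguishment and an empirical entropy of uniformly quantized features as a stand-in for the rate term. The function names, the temperature, the quantization step, and the weight `lam` are all hypothetical choices made for illustration. A learned codec would normally use a differentiable, learned entropy model rather than the histogram estimate shown here.

```python
import numpy as np


def info_nce(z1, z2, temperature=0.2):
    """InfoNCE-style loss: row i of z1 should match row i of z2 (assumed stand-in)."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


def empirical_bits_per_symbol(features, step=1.0):
    """Entropy (bits per symbol) of uniformly quantized feature values.

    Non-differentiable histogram estimate, used here only as a rate proxy.
    """
    symbols = np.round(features / step).astype(np.int64).ravel()
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def joint_objective(z1, z2, lam=0.1, step=1.0):
    """Toy joint objective: contrastive term plus lam times the rate proxy."""
    return info_nce(z1, z2) + lam * empirical_bits_per_symbol(z1, step)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(8, 16))
    view_b = view_a + 0.05 * rng.normal(size=(8, 16))  # slightly perturbed second view
    print("contrastive:", info_nce(view_a, view_b))
    print("rate proxy (bits/symbol):", empirical_bits_per_symbol(view_a))
    print("joint:", joint_objective(view_a, view_b))
```

In this toy form, raising `lam` puts more weight on compactness, while lowering it favors discriminative features. How the actual IF module balances the two terms is described in the paper, not here.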
