Paper Title
Distilled Low Rank Neural Radiance Field with Quantization for Light Field Compression
Paper Authors
Paper Abstract
We propose in this paper a Quantized Distilled Low-Rank Neural Radiance Field (QDLR-NeRF) representation for the task of light field compression. While existing compression methods encode the set of light field sub-aperture images, our proposed method learns an implicit scene representation in the form of a Neural Radiance Field (NeRF), which also enables view synthesis. To reduce its size, the model is first learned under a Low-Rank (LR) constraint using a Tensor Train (TT) decomposition within an Alternating Direction Method of Multipliers (ADMM) optimization framework. To further reduce the model's size, the components of the tensor train decomposition need to be quantized. However, simultaneously considering the optimization of the NeRF model with both the low-rank constraint and rate-constrained weight quantization is challenging. To address this difficulty, we introduce a network distillation operation that separates the low-rank approximation and the weight quantization during network training. The information from the initial LR-constrained NeRF (LR-NeRF) is distilled into a model of much smaller dimension (DLR-NeRF) based on the TT decomposition of the LR-NeRF. We then learn an optimized global codebook to quantize all TT components, producing the final QDLR-NeRF. Experimental results show that our proposed method yields better compression efficiency compared to state-of-the-art methods, and it additionally has the advantage of allowing the synthesis of any light field view with high quality.
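The two size-reduction ideas named in the abstract, a Tensor Train (TT) factorization and a single global codebook shared by all TT components, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the codebook is optimized, so plain 1-D Lloyd (k-means) iterations are swapped in here, without any rate constraint. The function names (`tt_param_count`, `learn_codebook`, `quantize`, `dequantize`), the core sizes, and the random values standing in for trained TT cores are all hypothetical.

```python
import random


def tt_param_count(mode_sizes, ranks):
    """Count the entries of TT cores shaped (r_{k-1}, n_k, r_k).

    `ranks` has len(mode_sizes) + 1 entries, with ranks[0] == ranks[-1] == 1.
    """
    return sum(ranks[k] * n * ranks[k + 1] for k, n in enumerate(mode_sizes))


def nearest(codebook, value):
    """Index of the codeword closest to `value`."""
    return min(range(len(codebook)), key=lambda j: abs(codebook[j] - value))


def learn_codebook(values, num_codes, iters=20):
    """Fit a scalar codebook with Lloyd (k-means) iterations.

    Assumption: unconstrained k-means stands in for the rate-constrained
    codebook optimization mentioned in the abstract.
    """
    lo, hi = min(values), max(values)
    # Initialise the codewords uniformly over the value range.
    codebook = [lo + (hi - lo) * (i + 0.5) / num_codes for i in range(num_codes)]
    for _ in range(iters):
        sums = [0.0] * num_codes
        counts = [0] * num_codes
        for v in values:
            j = nearest(codebook, v)
            sums[j] += v
            counts[j] += 1
        # Move each codeword to the mean of its cluster; keep empty ones as-is.
        codebook = [
            sums[j] / counts[j] if counts[j] else codebook[j]
            for j in range(num_codes)
        ]
    return codebook


def quantize(core, codebook):
    """Replace every entry of a flattened core by a codebook index."""
    return [nearest(codebook, v) for v in core]


def dequantize(indices, codebook):
    """Map codebook indices back to codeword values."""
    return [codebook[j] for j in indices]


# Hypothetical stand-ins for three flattened TT cores of a distilled model.
random.seed(0)
cores = [[random.gauss(0.0, 1.0) for _ in range(n)] for n in (64, 256, 64)]

# One global codebook is fitted on the entries of all cores together.
all_values = [v for core in cores for v in core]
codebook = learn_codebook(all_values, num_codes=16)

indices = [quantize(core, codebook) for core in cores]
restored = [dequantize(idx, codebook) for idx in indices]
max_err = max(
    abs(a - b) for core, rec in zip(cores, restored) for a, b in zip(core, rec)
)
```

As a usage note, `tt_param_count([4, 8, 8, 4], [1, 3, 3, 3, 1])` counts the entries of four TT cores representing a tensor that would hold 4 * 8 * 8 * 4 entries in dense form. With a 16-entry codebook, each stored index needs 4 bits before any entropy coding, and the codebook itself is stored once for all cores.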