Paper Title
Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer
Paper Authors
Paper Abstract
Neuromorphic spike data, an emerging modality with high temporal resolution, has shown promising potential in autonomous driving by mitigating the challenges posed by high-velocity motion blur. However, training a spike depth estimation network poses significant challenges in two respects: sparse spatial information for pixel-wise tasks, and the difficulty of obtaining paired depth labels for temporally dense spike streams. We therefore introduce open-source RGB data to support spike depth estimation, leveraging its annotations and spatial information. The inherent differences in modality and data distribution make it challenging to directly transfer learning from open-source RGB to the target spike data. To this end, we propose a cross-modality cross-domain (BiCross) framework that realizes unsupervised spike depth estimation by introducing simulated source spike data as an intermediary. Specifically, we design a Coarse-to-Fine Knowledge Distillation (CFKD) approach that facilitates comprehensive cross-modality knowledge transfer while preserving the unique strengths of both modalities via a spike-oriented uncertainty scheme. We then propose a Self-Correcting Teacher-Student (SCTS) mechanism that selects reliable pixel-wise pseudo labels and eases the domain shift of the student model, avoiding error accumulation on the target spike data. To verify the effectiveness of BiCross, we conduct extensive experiments on four scenarios: Synthetic to Real, Extreme Weather, Scene Changing, and Real Spike. Our method achieves state-of-the-art (SOTA) performance compared with RGB-oriented unsupervised depth estimation methods. Code and dataset: https://github.com/Theia-4869/BiCross
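The SCTS mechanism described above filters the teacher's dense predictions down to reliable per-pixel pseudo labels before training the student. A minimal sketch of such uncertainty-based screening is shown below; the function name, the fixed threshold, and the NaN-masking convention are illustrative assumptions, not the paper's actual selection criterion.

```python
import numpy as np

def screen_pseudo_labels(teacher_depth, teacher_uncertainty, threshold=0.2):
    """Keep only pixels whose teacher uncertainty falls below a threshold.

    Returns the pseudo-label map (unreliable pixels set to NaN so a
    downstream loss can ignore them) and the boolean reliability mask.
    Assumption: lower uncertainty means a more trustworthy prediction.
    """
    mask = teacher_uncertainty < threshold
    pseudo = np.where(mask, teacher_depth, np.nan)
    return pseudo, mask

# Toy example: a 2x2 depth map with per-pixel uncertainty estimates.
depth = np.array([[1.0, 2.0], [3.0, 4.0]])
unc = np.array([[0.1, 0.5], [0.05, 0.3]])
pseudo, mask = screen_pseudo_labels(depth, unc, threshold=0.2)
```

In practice a scheme like this would recompute the threshold per image or anneal it over training; only the masked pixels contribute to the student's loss, which is how error accumulation from unreliable teacher predictions is avoided.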