Working memory inspired hierarchical video decomposition with transformative representations

Paper title

Working memory inspired hierarchical video decomposition with transformative representations

Authors

Qin, Binjie; Mao, Haohao; Zhang, Ruipeng; Zhu, Yueqi; Ding, Song; Chen, Xu

Abstract

Video decomposition is very important to extract moving foreground objects from complex backgrounds in computer vision, machine learning, and medical imaging, e.g., extracting moving contrast-filled vessels from the complex and noisy backgrounds of X-ray coronary angiography (XCA). However, the challenges caused by dynamic backgrounds, overlapping heterogeneous environments and complex noises still exist in video decomposition. To solve these problems, this study is the first to introduce a flexible visual working memory model in video decomposition tasks to provide interpretable and high-performance hierarchical deep architecture, integrating the transformative representations between sensory and control layers from the perspective of visual and cognitive neuroscience. Specifically, robust PCA unrolling networks acting as a structure-regularized sensor layer decompose XCA into sparse/low-rank structured representations to separate moving contrast-filled vessels from noisy and complex backgrounds. Then, patch recurrent convolutional LSTM networks with a backprojection module embody unstructured random representations of the control layer in working memory, recurrently projecting spatiotemporally decomposed nonlocal patches into orthogonal subspaces for heterogeneous vessel retrieval and interference suppression. This video decomposition deep architecture effectively restores the heterogeneous profiles of intensity and the geometries of moving objects against the complex background interferences. Experiments show that the proposed method significantly outperforms state-of-the-art methods in accurate moving contrast-filled vessel extraction with excellent flexibility and computational efficiency.
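The sparse/low-rank split mentioned in the abstract can be illustrated with classical robust PCA. The sketch below is not the paper's unrolled network or its patch recurrent LSTM stage; it is a plain principal component pursuit solver with a fixed penalty, and it assumes the video has been flattened into a matrix with one frame per row. The function names (`soft_threshold`, `svd_threshold`, `rpca`) and the default parameter choices are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink every entry of x toward zero by tau (promotes sparsity)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (promotes low rank)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(m, n_iter=500, tol=1e-7):
    """Split m into low + sparse by alternating the two shrinkage steps.

    Assumption: m is a nonzero (frames x pixels) matrix. The low-rank part
    plays the role of the slowly varying background, the sparse part the
    moving foreground (e.g. contrast-filled vessels).
    """
    lam = 1.0 / np.sqrt(max(m.shape))        # common default sparsity weight
    mu = m.size / (4.0 * np.abs(m).sum())    # fixed penalty parameter
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(n_iter):
        low = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + dual / mu, lam / mu)
        resid = m - low - sparse
        dual = dual + mu * resid
        if np.linalg.norm(resid) <= tol * np.linalg.norm(m):
            break
    return low, sparse
```

An unrolled network, as described in the abstract, would replace the fixed thresholds and a fixed number of such iterations with learned layers; the code above only shows the classical iteration that such unrolling starts from.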

Paper title