Sparsity in Continuous-Depth Neural Networks

Paper title

Sparsity in Continuous-Depth Neural Networks

Authors

Hananeh Aliee, Till Richter, Mikhail Solonin, Ignacio Ibarra, Fabian Theis, Niki Kilbertus

Abstract

Neural Ordinary Differential Equations (NODEs) have proven successful in learning dynamical systems in terms of accurately recovering the observed trajectories. While different types of sparsity have been proposed to improve robustness, the generalization properties of NODEs for dynamical systems beyond the observed data are underexplored. We systematically study the influence of weight and feature sparsity on forecasting as well as on identifying the underlying dynamical laws. Besides assessing existing methods, we propose a regularization technique to sparsify "input-output connections" and extract relevant features during training. Moreover, we curate real-world datasets consisting of human motion capture and human hematopoiesis single-cell RNA-seq data to realistically analyze different levels of out-of-distribution (OOD) generalization in forecasting and dynamics identification respectively. Our extensive empirical evaluation on these challenging benchmarks suggests that weight sparsity improves generalization in the presence of noise or irregular sampling. However, it does not prevent learning spurious feature dependencies in the inferred dynamics, rendering them impractical for predictions under interventions, or for inferring the true underlying dynamics. Instead, feature sparsity can indeed help with recovering sparse ground-truth dynamics compared to unregularized NODEs.
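The distinction the abstract draws between weight sparsity and feature sparsity can be illustrated with a small sketch. The abstract does not give the form of the proposed regularizer, so everything below is an assumption made for illustration: the function names are hypothetical, and the penalty on "input-output connections" is modeled as the L1 norm of the product of absolute weight matrices of a bias-free MLP. It is not presented as the authors' implementation.

```python
import numpy as np

def weight_l1(weights):
    """Weight-sparsity penalty: sum of absolute values of all weights."""
    return sum(np.abs(W).sum() for W in weights)

def input_output_connectivity(weights):
    """Product of absolute weight matrices, first layer applied first.

    Entry (i, j) is zero exactly when no path of nonzero weights
    links input j to output i. Each W has shape (out_dim, in_dim).
    """
    C = np.abs(weights[0])
    for W in weights[1:]:
        C = np.abs(W) @ C
    return C

def feature_l1(weights):
    """Assumed feature-sparsity penalty: L1 norm of the connectivity matrix."""
    return input_output_connectivity(weights).sum()

# Toy network: 3 inputs -> 2 hidden units -> 2 outputs (illustrative values).
W1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 2.0, 0.0]])
W2 = np.array([[1.0, 0.0],
               [1.0, -1.0]])
weights = [W1, W2]

C = input_output_connectivity(weights)
print(C)                    # connectivity between each output and each input
print(weight_l1(weights))   # penalty on individual weights
print(feature_l1(weights))  # penalty on input-output connections
```

In this toy network the third column of `C` is all zeros, so no output depends on the third input. A penalty on individual weights can shrink many entries without removing every path from an input to an output, which is one way to read the abstract's observation that weight sparsity alone does not prevent spurious feature dependencies.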
