Paper title
Learning Causal Semantic Representation for Out-of-Distribution Prediction
Paper authors
Paper abstract
Conventional supervised learning methods, especially deep ones, are found to be sensitive to out-of-distribution (OOD) examples, largely because the learned representation mixes the semantic factor with the variation factor due to their domain-specific correlation, while only the semantic factor causes the output. To address the problem, we propose a Causal Semantic Generative model (CSG) based on causal reasoning so that the two factors are modeled separately, and develop methods for OOD prediction from a single training domain, a setting that is common and challenging. The methods are based on the causal invariance principle, with a novel design in variational Bayes for both efficient learning and easy prediction. Theoretically, we prove that under certain conditions, CSG can identify the semantic factor by fitting training data, and this semantic identification guarantees the boundedness of the OOD generalization error and the success of adaptation. An empirical study shows improved OOD performance over prevailing baselines.
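The separation of the two factors described in the abstract can be illustrated with a generative factorization. The following is only a sketch under assumed notation: the symbols x (input), y (output), s (semantic factor), and v (variation factor) do not appear in the abstract and are introduced here for illustration, and the exact form used in the paper may differ.

```latex
% Assumed notation (not taken from the abstract):
%   x = input, y = output, s = semantic factor, v = variation factor.
\begin{align}
  p(s, v, x, y) &= p(s, v)\, p(x \mid s, v)\, p(y \mid s), \\
  p(x, y) &= \int p(s, v)\, p(x \mid s, v)\, p(y \mid s)\, \mathrm{d}s\, \mathrm{d}v .
\end{align}
```

In this sketch the output depends on s alone, which matches the statement that only the semantic factor causes the output, while the domain-specific correlation between the two factors sits in the joint prior p(s, v). One reading of the causal invariance principle is that the mechanisms p(x | s, v) and p(y | s) stay fixed across domains and only the prior p(s, v) changes; this reading is an assumption, not a statement from the abstract.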