Paper Title
A Detailed Study of Interpretability of Deep Neural Network based Top Taggers
Paper Authors
Paper Abstract
Recent developments in the methods of explainable AI (XAI) allow researchers to explore the inner workings of deep neural networks (DNNs), revealing crucial information about input-output relationships and understanding how data connects with machine learning models. In this paper, we explore the interpretability of DNN models designed to identify jets coming from top quark decays in high energy proton-proton collisions at the Large Hadron Collider (LHC). We review a subset of existing top tagger models and explore different quantitative methods to identify which features play the most important roles in identifying top jets. We also investigate how and why feature importance varies across different XAI metrics, how correlations among features impact their explainability, and how latent space representations encode information as well as correlate with physically meaningful quantities. Our studies uncover some major pitfalls of existing XAI methods and illustrate how they can be overcome to obtain consistent and meaningful interpretations of these models. We additionally illustrate the activity of hidden layers as Neural Activation Pattern (NAP) diagrams and demonstrate how they can be used to understand how DNNs relay information across the layers and how this understanding can help make such models significantly simpler by allowing effective model reoptimization and hyperparameter tuning. These studies not only facilitate a methodological approach to interpreting models but also unveil new insights about what these models learn. Incorporating these observations into augmented model design, we propose the Particle Flow Interaction Network (PFIN) model and demonstrate how interpretability-inspired model augmentation can improve top tagging performance.
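Below is a minimal, illustrative sketch of the Neural Activation Pattern idea mentioned in the abstract: recording which hidden units of a trained tagger fire for top-jet versus background inputs and comparing the per-class activation rates. The toy model, layer names, input features, and the per-neuron activation-rate summary are assumptions made for illustration only; they are not the paper's exact NAP construction.

```python
# Hypothetical sketch of a NAP-style summary for a jet tagger (assumed setup).
import torch
import torch.nn as nn

class ToyTagger(nn.Module):
    """Small stand-in classifier; not the paper's actual top-tagger architecture."""
    def __init__(self, n_features=10, hidden=32):
        super().__init__()
        self.fc1 = nn.Linear(n_features, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):
        h1 = torch.relu(self.fc1(x))
        h2 = torch.relu(self.fc2(h1))
        return torch.sigmoid(self.out(h2))

def activation_pattern(model, x):
    """Fraction of inputs for which each hidden unit fires (pre-activation > 0)."""
    acts, hooks = {}, []

    def make_hook(name):
        def hook(_module, _inp, out):
            # ReLU(out) > 0 exactly when out > 0, so this is the per-neuron firing rate.
            acts[name] = (out > 0).float().mean(dim=0)
        return hook

    for name in ("fc1", "fc2"):
        hooks.append(getattr(model, name).register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    return acts

model = ToyTagger()
top_jets = torch.randn(256, 10)   # stand-in for preprocessed top-jet features
qcd_jets = torch.randn(256, 10)   # stand-in for QCD background features

nap_top = activation_pattern(model, top_jets)
nap_qcd = activation_pattern(model, qcd_jets)

# Neurons whose activation rates differ most between classes hint at where the
# network separates signal from background; consistently inactive neurons are
# candidates for pruning during model reoptimization.
diff = (nap_top["fc2"] - nap_qcd["fc2"]).abs()
print("Most class-discriminating units in fc2:", diff.topk(5).indices.tolist())
```

Comparing such per-class activation rates layer by layer is one way activation patterns could guide model simplification and hyperparameter tuning, in the spirit of the NAP diagrams described in the abstract.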