Paper title
Extreme data compression while searching for new physics
Paper authors
Paper abstract
Bringing a high-dimensional dataset into science-ready shape is a formidable challenge that often necessitates data compression. Compression has accordingly become a key consideration for contemporary cosmology, affecting public data releases and reanalyses searching for new physics. However, data compression optimized for a particular model can suppress signs of new physics, or even remove them altogether. We therefore provide a solution for exploring new physics \emph{during} data compression. In particular, we store additional agnostic compressed data points, selected to enable precise constraints on non-standard physics at a later date. Our procedure is based on the maximal compression of the MOPED algorithm, which optimally filters the data with respect to a baseline model. We select additional filters, based on a generalised principal component analysis, which are carefully constructed to scout for new physics at high precision and speed. We refer to the augmented set of filters as MOPED-PC. They enable an analytic computation of Bayesian evidences that may indicate the presence of new physics, and fast analytic estimates of best-fitting parameters when adopting a specific non-standard theory, without further expensive MCMC analysis. As there may be large numbers of non-standard theories, the speed of the method becomes essential. Should no new physics be found, our approach preserves the precision of the standard parameters. As a result, we achieve very rapid and maximally precise constraints on standard and non-standard physics, with a technique that scales well to high-dimensional datasets.
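For context, the baseline step referred to above is standard MOPED compression: one filter per parameter, built from the data covariance and the derivatives of the model mean, with a Gram-Schmidt step so the compressed numbers are uncorrelated. The sketch below is a minimal NumPy illustration of that standard scheme under a Gaussian, parameter-independent covariance; the function and variable names are ours, not the paper's, and the additional PCA-based filters that make up MOPED-PC are not reproduced here.

import numpy as np

def moped_filters(dmu_dtheta, cov):
    # dmu_dtheta: (n_params, n_data) derivatives of the model mean with respect
    #             to each parameter, evaluated at the fiducial point.
    # cov:        (n_data, n_data) data covariance, assumed parameter-independent.
    cinv = np.linalg.inv(cov)
    filters = []
    for dmu in dmu_dtheta:
        b = cinv @ dmu
        # Gram-Schmidt against earlier filters (in the covariance metric), so the
        # compressed statistics are mutually uncorrelated.
        for b_prev in filters:
            b = b - (dmu @ b_prev) * b_prev
        b = b / np.sqrt(b @ cov @ b)   # normalise so each compressed value has unit variance
        filters.append(b)
    return np.array(filters)

# Compression: one number per parameter, y_i = b_i . x
# y = moped_filters(dmu_dtheta, cov) @ data_vector

With n parameters this maps the full data vector to just n numbers while, for a linear model with Gaussian parameter-independent noise, preserving the Fisher information; the paper's contribution is the extra, model-agnostic filters appended to this set to keep sensitivity to physics outside the baseline model.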