Paper Title

Using Multivariate Imputation by Chained Equations to Predict Redshifts of Active Galactic Nuclei

Authors

Gibson, Spencer James; Narendra, Aditya; Dainotti, Maria Giovanna; Bogdan, Malgorzata; Pollo, Agnieszka; Poliszczuk, Artem; Rinaldi, Enrico; Liodakis, Ioannis

Abstract

Redshift measurement of active galactic nuclei (AGNs) remains a time-consuming and challenging task, as it requires follow-up spectroscopic observations and detailed analysis. Hence, there is an urgent need for alternative redshift estimation techniques. The use of machine learning (ML) for this purpose has been growing over the last few years, primarily due to the availability of large-scale galactic surveys. However, due to observational errors, a significant fraction of these data sets often has missing entries, rendering that fraction unusable for ML regression applications. In this study, we demonstrate the performance of an imputation technique called Multivariate Imputation by Chained Equations (MICE), which rectifies the issue of missing data entries by imputing them using the available information in the catalog. We use the Fermi-LAT Fourth Data Release Catalog (4LAC) and impute 24% of the catalog. Subsequently, we follow the methodology described in Dainotti et al. (2021) and create an ML model for estimating the redshift of 4LAC AGNs. We present results that highlight the positive impact of the MICE imputation technique on the performance of the machine learning model and on the accuracy of the resulting redshift estimates.
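
To make the idea of imputing missing entries from the available information concrete, the following is a minimal sketch of the chained-regression loop that underlies MICE. It is not the authors' pipeline. It is a simplified, deterministic, single-imputation variant, whereas full MICE typically adds random draws and produces several imputed data sets. The function names (`solve_linear`, `fit_ols`, `chained_impute`), the use of `None` to mark a missing entry, and the toy numbers are all assumptions made for illustration; none of them come from the paper or from 4LAC.

```python
import statistics


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)  # [intercept, slope_1, slope_2, ...]


def chained_impute(data, n_iter=10):
    """Fill None entries column by column, regressing each on the others."""
    n_cols = len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [row[:] for row in data]  # leave the caller's data untouched

    # Step 1: start from a crude guess, the mean of each column's observed values.
    for j in range(n_cols):
        mean_j = statistics.fmean(row[j] for row in data if row[j] is not None)
        for i, row in enumerate(filled):
            if missing[i][j]:
                row[j] = mean_j

    # Step 2: cycle through the columns, refitting and re-predicting each time.
    for _ in range(n_iter):
        for j in range(n_cols):
            if not any(m[j] for m in missing):
                continue  # nothing to impute in this column
            others = [c for c in range(n_cols) if c != j]
            obs = [i for i in range(len(filled)) if not missing[i][j]]
            beta = fit_ols([[filled[i][c] for c in others] for i in obs],
                           [filled[i][j] for i in obs])
            for i, row in enumerate(filled):
                if missing[i][j]:
                    row[j] = beta[0] + sum(b * row[c] for b, c in zip(beta[1:], others))
    return filled


# Toy table with made-up numbers (not 4LAC data); None marks a missing entry.
toy_catalog = [[1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, None], [None, 11.0]]
completed = chained_impute(toy_catalog)
```

Observed entries are never overwritten; only the positions that were `None` receive predictions. The completed table could then be passed to a regression model in place of the subset of rows that happened to be complete.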
