The miniJPAS survey: star-galaxy classification using machine learning

Paper title

The miniJPAS survey: star-galaxy classification using machine learning

Authors

Baqui, P. O., Marra, V., Casarini, L., Angulo, R., Díaz-García, L. A., Hernández-Monteagudo, C., Lopes, P. A. A., López-Sanjuan, C., Muniesa, D., Placco, V. M., Quartin, M., Queiroz, C., Sobral, D., Solano, E., Tempel, E., Varela, J., Vílchez, J. M., Abramo, R., Alcaniz, J., Benitez, N., Bonoli, S., Carneiro, S., Cenarro, A. J., Cristóbal-Hornillos, D., de Amorim, A. L., de Oliveira, C. M., Dupke, R., Ederoclite, A., González Delgado, R. M., Marín-Franch, A., Moles, M., Vázquez Ramió, H., Sodré, L., Taylor, K.

Abstract

Future astrophysical surveys such as J-PAS will produce very large datasets, which will require the deployment of accurate and efficient machine learning (ML) methods. In this work, we analyze the miniJPAS survey, which observed about 1 deg² of the AEGIS field with 56 narrow-band filters and 4 ugri broad-band filters. We discuss the classification of miniJPAS sources into extended (galaxies) and point-like (e.g. stars) objects, a necessary step for the subsequent scientific analyses. We aim to develop an ML classifier that is complementary to traditional tools based on explicit modeling. To train and test our classifiers, we crossmatched the miniJPAS dataset with SDSS and HSC-SSP data. We trained and tested 6 different ML algorithms on the two crossmatched catalogs. As input for the ML algorithms, we use the magnitudes from the 60 filters together with their errors, with and without the morphological parameters. We also use the mean PSF in the r detection band for each pointing. We find that the random forest (RF) and extremely randomized trees (ERT) algorithms perform best in all scenarios. When analyzing the full magnitude range of 15 < r < 23.5, we find an area under the curve of AUC = 0.957 with RF when using only photometric information, and AUC = 0.986 with ERT when using photometric and morphological information. Regarding feature importance, when using morphological parameters, FWHM is the most important feature. When using photometric information only, we observe that broad bands are not necessarily more important than narrow bands, and errors are as important as the measurements. ML algorithms can compete with traditional star/galaxy classifiers, outperforming the latter at fainter magnitudes (r > 21). We use our best classifiers, with and without morphology, to produce a value-added catalog available at https://j-pas.org/datareleases.
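
The abstract reports classifier quality as AUC, the area under the ROC curve. As a rough illustration of what that number measures, the sketch below computes AUC from class labels and classifier scores using the rank-based (Mann-Whitney) formulation: the fraction of (positive, negative) pairs in which the positive object receives the higher score. The function name `roc_auc`, the choice of galaxies as the positive class, and the toy labels and scores are all assumptions made for illustration; they are not miniJPAS data, and the abstract does not say how the authors computed AUC.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for the positive class (hypothetically, galaxies), 0 otherwise.
    scores: classifier outputs, where higher means "more likely positive".
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks in order of increasing score, averaging over ties.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one object of each class")

    # U counts the (positive, negative) pairs ranked in the correct order,
    # with tied pairs contributing one half each.
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example (made-up values): 4 "galaxies" (1) and 4 "stars" (0).
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.95, 0.80, 0.60, 0.70, 0.30, 0.55, 0.20, 0.10]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 14 of 16 pairs are ordered correctly: 0.875
```

An AUC of 1 means every galaxy is scored above every star, while 0.5 corresponds to random ordering, so the quoted values of 0.957 and 0.986 indicate that the vast majority of such pairs are ranked correctly.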
