Paper Title

Classification of Quasars, Galaxies, and Stars in the Mapping of the Universe Multi-modal Deep Learning

Paper Authors

Sabeesh Ethiraj, Bharath Kumar Bolla

Paper Abstract

In this paper, the fourth phase of the Sloan Digital Sky Survey (SDSS-4), Data Release 16 dataset was used to classify SDSS objects into galaxies, stars, and quasars using machine learning and deep learning architectures. We efficiently utilize both images and metadata in tabular format to build a novel multi-modal architecture and achieve state-of-the-art results. In addition, our experiments on transfer learning using ImageNet weights on five different architectures (ResNet-50, DenseNet-121, VGG-16, Xception, and EfficientNet) reveal that freezing all layers and adding a single final trainable layer may not be an optimal solution for transfer learning. We hypothesize that the higher the number of trainable layers, the higher the training time and prediction accuracy. We also hypothesize that increasing the number of trainable layers beyond a point, towards the base layers, will not improve accuracy, since the pre-trained lower layers only perform low-level feature extraction, which is quite similar across datasets. Hence, the ideal number of trainable layers needs to be identified for each model with respect to its number of parameters. For the tabular data, we compared classical machine learning algorithms (Logistic Regression, Random Forest, Decision Trees, AdaBoost, LightGBM, etc.) with artificial neural networks. Our work sheds new light on transfer learning and multi-modal deep learning architectures. The multi-modal architecture not only yielded higher metrics (accuracy, precision, recall, F1 score) than models using only image data or only tabular data, but also achieved the best metrics in fewer training epochs and improved the metrics on all classes.
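The two ideas in the abstract can be illustrated with a minimal PyTorch sketch. This is a hypothetical toy model, not the authors' code: the layer sizes, the eight-feature tabular input, and the split of the image backbone into "lower" and "upper" stages are all illustrative assumptions. It shows (1) freezing only the lower, generic layers of a pre-trained-style image backbone while keeping the upper layers trainable, and (2) fusing image features with tabular metadata before a three-way galaxy/star/quasar classifier.

```python
import torch
import torch.nn as nn

class MultiModalNet(nn.Module):
    """Toy sketch of the abstract's multi-modal idea (illustrative sizes)."""

    def __init__(self, n_tabular=8, n_classes=3):
        super().__init__()
        # Stand-in for a pre-trained backbone (the paper uses ImageNet
        # weights on ResNet-50, DenseNet-121, VGG-16, Xception, EfficientNet).
        self.backbone_lower = nn.Sequential(          # generic low-level features
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.backbone_upper = nn.Sequential(          # task-specific features
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())    # -> 32 features
        self.tabular = nn.Sequential(                 # metadata branch
            nn.Linear(n_tabular, 16), nn.ReLU())      # -> 16 features
        self.head = nn.Linear(32 + 16, n_classes)     # fused classifier

        # Freeze only the lower layers: per the abstract, these learn
        # low-level features that are similar across datasets, so making
        # them trainable adds training time without improving accuracy.
        for p in self.backbone_lower.parameters():
            p.requires_grad = False

    def forward(self, image, tabular):
        img_feat = self.backbone_upper(self.backbone_lower(image))
        fused = torch.cat([img_feat, self.tabular(tabular)], dim=1)
        return self.head(fused)

net = MultiModalNet()
logits = net(torch.randn(4, 3, 32, 32), torch.randn(4, 8))
print(logits.shape)  # one logit per class for each of the 4 examples
```

Concatenation before the classification head is one common fusion strategy; the key point from the abstract is that the fused model outperforms either branch alone and converges in fewer epochs.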
