Transfer Learning with Convolutional Networks for Atmospheric Parameter Retrieval

Paper title

Transfer Learning with Convolutional Networks for Atmospheric Parameter Retrieval

Authors

David Malmgren-Hansen, Allan Aasbjerg Nielsen, Valero Laparra, Gustau Camps-Valls

Abstract

The Infrared Atmospheric Sounding Interferometer (IASI) on board the MetOp satellite series provides important measurements for Numerical Weather Prediction (NWP). Retrieving accurate atmospheric parameters from the raw data provided by IASI is a major challenge, but it is necessary in order to use the data in NWP models. The performance of statistical models is compromised by the extremely high spectral dimensionality and the large number of variables to be predicted simultaneously across the atmospheric column. All this makes it difficult to select and study optimal models and processing schemes. Earlier work has shown that non-linear models such as kernel methods and neural networks perform well on this task, but both are computationally heavy on large quantities of data: kernel methods do not scale well with the number of training samples, and neural networks require setting critical hyperparameters. In this work we follow an alternative pathway: we study transfer learning in convolutional neural networks (CNNs) to alleviate the retraining cost by starting from proxy solutions (either features or networks) obtained from previously trained models for related variables. We show how features extracted from the IASI data by a CNN trained to predict one physical variable can be used as inputs to another statistical method designed to predict a different physical variable at low altitude. In addition, the learned parameters can be transferred to another CNN model, which then requires only fine-tuning to obtain results equivalent to those of a CNN trained from scratch.
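The feature-transfer idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the spectra and target are synthetic, the "pretrained" convolutional layer is a set of random fixed kernels standing in for filters learned on a related variable, and ridge regression stands in for the second statistical method. The names `conv_features`, `fit_ridge`, and `predict_ridge` are hypothetical helpers.

```python
import numpy as np


def conv_features(spectra, kernels, pool=4):
    """Frozen feature extractor: 1-D convolution along the spectral axis,
    ReLU, then non-overlapping average pooling. The kernels are never updated,
    mimicking layers reused from a previously trained CNN."""
    n_samples = spectra.shape[0]
    feats = []
    for kernel in kernels:
        # 'valid' convolution of every spectrum with one fixed kernel
        resp = np.stack([np.convolve(s, kernel, mode="valid") for s in spectra])
        resp = np.maximum(resp, 0.0)  # ReLU
        usable = (resp.shape[1] // pool) * pool  # drop any remainder before pooling
        pooled = resp[:, :usable].reshape(n_samples, -1, pool).mean(axis=2)
        feats.append(pooled)
    return np.concatenate(feats, axis=1)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with a bias column.
    For simplicity the bias term is regularized as well."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    A = X1.T @ X1 + alpha * np.eye(X1.shape[1])
    return np.linalg.solve(A, X1.T @ y)


def predict_ridge(X, w):
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return X1 @ w


rng = np.random.default_rng(0)
spectra = rng.normal(size=(200, 64))   # synthetic stand-in for IASI spectra
kernels = rng.normal(size=(3, 5))      # stand-in for pretrained CNN filters
# Synthetic target playing the role of a "different physical variable"
target = spectra[:, :8].mean(axis=1) + 0.01 * rng.normal(size=200)

features = conv_features(spectra, kernels)            # reuse the frozen extractor
weights = fit_ridge(features[:150], target[:150])     # train only the new model
pred = predict_ridge(features[150:], weights)         # predict on held-out samples
```

Only the small regression model is fitted here; the convolutional filters stay fixed, which is what keeps the retraining cost low in this kind of transfer. Fine-tuning, the second strategy mentioned in the abstract, would instead continue to update the transferred filters on the new target.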

Paper title