Paper title

Convolutional neural networks for classification and regression analysis of one-dimensional spectral data

Authors

Jernelv, Ine L.; Hjelme, Dag Roar; Matsuura, Yuji; Aksnes, Astrid

Abstract

Convolutional neural networks (CNNs) are widely used for image recognition and text analysis, and have been suggested for application on one-dimensional data as a way to reduce the need for pre-processing steps. Pre-processing is an integral part of multivariate analysis, but determination of the optimal pre-processing methods can be time-consuming due to the large number of available methods. In this work, the performance of a CNN was investigated for classification and regression analysis of spectral data. The CNN was compared with various other chemometric methods, including support vector machines (SVMs) for classification and partial least squares regression (PLSR) for regression analysis. The comparisons were made both on raw data, and on data that had gone through pre-processing and/or feature selection methods. The models were used on spectral data acquired with methods based on near-infrared, mid-infrared, and Raman spectroscopy. For the classification datasets the models were evaluated based on the percentage of correctly classified observations, while for regression analysis the models were assessed based on the coefficient of determination (R$^2$). Our results show that CNNs can outperform standard chemometric methods, especially for classification tasks where no pre-processing is used. However, both CNN and the standard chemometric methods see improved performance when proper pre-processing and feature selection methods are used. These results demonstrate some of the capabilities and limitations of CNNs used on one-dimensional data.
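The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names `percent_correct` and `r_squared` are hypothetical, and the R$^2$ formula used here (one minus the ratio of residual to total sum of squares) is the common textbook definition, assumed rather than taken from the paper, which may compute it differently.

```python
def percent_correct(y_true, y_pred):
    """Percentage of observations whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * hits / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up toy labels and values, for illustration only.
    print(percent_correct([0, 1, 1, 2], [0, 1, 2, 2]))
    print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

Under this definition, a model that always predicts the mean of the reference values scores 0, and a perfect model scores 1.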