Paper title
Combination of Raman spectroscopy and chemometrics: A review of recent studies published in the Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy Journal
Paper authors
Paper abstract
Raman spectroscopy is a promising technique for noninvasive analysis of samples in many fields of application because it can probe the molecular fingerprint of a sample. Chemometric methods are now widely used to better understand the recorded spectral fingerprints of samples and the differences in their chemical composition. This review considers manuscripts published in Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy that report the use of Raman spectroscopy in combination with chemometrics to study samples and their changes caused by different factors. In 57 reviewed manuscripts, we analyzed the application of chemometric algorithms, statistical modeling parameters, the use of cross-validation, sample sizes, and the performance of the proposed classification and regression models. We summarize the best strategies for creating classification models and highlight common drawbacks in the application of chemometric techniques. According to our estimates, about 70% of the papers are likely to contain unsupported or invalid data because of insufficient description of the methods used or drawbacks of the proposed classification models. These drawbacks include: (1) an experimental sample size too small for classification/regression to achieve significant and reliable results; (2) lack of cross-validation (or a test set) to verify classifier/regression performance; (3) incorrect division of the spectral data into training and test/validation sets; and (4) improper selection of the number of principal components (PCs) used to reduce the dimension of the analyzed spectral data.
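Drawbacks (2) and (3) can be made concrete with a small sketch. The abstract does not say which splitting errors the reviewed papers made, so the scenario here is an assumption: several replicate spectra are recorded per physical sample, and a split is treated as valid only if all spectra from one sample fall on the same side of it. The function name `group_k_fold` and the toy `sample_ids` list are hypothetical and are not taken from the review.

```python
import random


def group_k_fold(group_ids, n_splits=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no group is split.

    group_ids[i] is the ID of the physical sample that spectrum i
    was recorded from (an assumed data layout, for illustration).
    """
    groups = sorted(set(group_ids))
    if n_splits < 2 or n_splits > len(groups):
        raise ValueError("n_splits must be between 2 and the number of groups")

    # Shuffle whole samples, not individual spectra.
    rng = random.Random(seed)
    rng.shuffle(groups)

    # Deal the shuffled samples round-robin into folds.
    fold_of = {g: i % n_splits for i, g in enumerate(groups)}

    for fold in range(n_splits):
        test_idx = [i for i, g in enumerate(group_ids) if fold_of[g] == fold]
        train_idx = [i for i, g in enumerate(group_ids) if fold_of[g] != fold]
        yield train_idx, test_idx


# Hypothetical data set: 6 samples with 3 replicate spectra each.
sample_ids = [s for s in range(6) for _ in range(3)]
folds = list(group_k_fold(sample_ids, n_splits=3))

for train_idx, test_idx in folds:
    train_samples = {sample_ids[i] for i in train_idx}
    test_samples = {sample_ids[i] for i in test_idx}
    # No sample contributes spectra to both sides of the split.
    assert train_samples.isdisjoint(test_samples)
```

Under the same assumption, any preprocessing that learns from the data, such as the PCA step behind drawback (4), would be fitted on the training indices of each fold only and then applied to the held-out spectra.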