Paper title
Maximizing information from chemical engineering data sets: Applications to machine learning
Paper authors
Paper abstract
It is well-documented how artificial intelligence can have (and already is having) a big impact on chemical engineering. But classical machine learning approaches may be weak for many chemical engineering applications. This review discusses how challenging data characteristics arise in chemical engineering applications. We identify four characteristics of data arising in chemical engineering applications that make applying classical artificial intelligence approaches difficult: (1) high variance, low volume data, (2) low variance, high volume data, (3) noisy/corrupt/missing data, and (4) restricted data with physics-based limitations. For each of these four data characteristics, we discuss applications where these data characteristics arise and show how current chemical engineering research is extending the fields of data science and machine learning to incorporate these challenges. Finally, we identify several challenges for future research.