Machine learning in physics: The pitfalls of poisoned training sets

Paper title

Machine learning in physics: The pitfalls of poisoned training sets

Authors

Fang, Chao; Barzegar, Amin; Katzgraber, Helmut G.

Abstract

Known for their ability to identify hidden patterns in data, artificial neural networks are among the most powerful machine learning tools. Most notably, neural networks have played a central role in identifying states of matter and phase transitions across condensed matter physics. To date, most studies have focused on systems where different phases of matter and their phase transitions are known, and thus the performance of neural networks is well controlled. While neural networks present an exciting new tool to detect new phases of matter, here we demonstrate that when the training sets are poisoned (i.e., poor training data or mislabeled data) it is easy for neural networks to make misleading predictions.
