Paper Title

Adversarial Examples and the Deeper Riddle of Induction: The Need for a Theory of Artifacts in Deep Learning

Paper Authors

Buckner, Cameron

Paper Abstract

Deep learning is currently the most widespread and successful technology in artificial intelligence. It promises to push the frontier of scientific discovery beyond current limits. However, skeptics have worried that deep neural networks are black boxes, and have called into question whether these advances can really be deemed scientific progress if humans cannot understand them. Relatedly, these systems also possess bewildering new vulnerabilities: most notably a susceptibility to "adversarial examples". In this paper, I argue that adversarial examples will become a flashpoint of debate in philosophy and diverse sciences. Specifically, new findings concerning adversarial examples have challenged the consensus view that the networks' verdicts on these cases are caused by overfitting idiosyncratic noise in the training set, and may instead be the result of detecting predictively useful "intrinsic features of the data geometry" that humans cannot perceive (Ilyas et al., 2019). These results should cause us to re-examine responses to one of the deepest puzzles at the intersection of philosophy and science: Nelson Goodman's "new riddle" of induction. Specifically, they raise the possibility that progress in a number of sciences will depend upon the detection and manipulation of useful features that humans find inscrutable. Before we can evaluate this possibility, however, we must decide which (if any) of these inscrutable features are real but available only to "alien" perception and cognition, and which are distinctive artifacts of deep learning, for artifacts like lens flares or Gibbs phenomena can be similarly useful for prediction, but are usually seen as obstacles to scientific theorizing. Thus, machine learning researchers urgently need to develop a theory of artifacts for deep neural networks, and I conclude by sketching some initial directions for this area of research.
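For readers unfamiliar with the phenomenon the abstract refers to, the sketch below (not part of the paper) shows one standard way adversarial examples are generated: the Fast Gradient Sign Method of Goodfellow et al. (2015), in which a perturbation too small for humans to notice, computed from the sign of the loss gradient, is often enough to change a classifier's verdict. The PyTorch classifier model, the input batch x with labels y, and the budget epsilon are illustrative assumptions, not anything specified in the paper.

    # Illustrative sketch only; not from Buckner's paper. Assumes a trained
    # PyTorch image classifier `model`, an input batch `x` with pixel values
    # in [0, 1], its true labels `y`, and a perturbation budget `epsilon`.
    import torch
    import torch.nn.functional as F

    def fgsm_adversarial_example(model, x, y, epsilon=0.03):
        """Fast Gradient Sign Method: nudge x by epsilon along the sign of the loss gradient."""
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        # The perturbation is imperceptible to humans, yet often flips the model's prediction.
        x_adv = x + epsilon * x.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()

Whether the features such perturbations exploit are mere overfitted noise or the predictively useful "intrinsic features of the data geometry" reported by Ilyas et al. (2019) is exactly the question the paper takes up.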
