A Survey of Deep Learning for Scientific Discovery

Paper Title

A Survey of Deep Learning for Scientific Discovery

Authors

Maithra Raghu, Eric Schmidt

Abstract

Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. At the same time, the amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity. Taken together, this suggests many exciting opportunities for deep learning applications in scientific settings. But a significant challenge to this is simply knowing where to start. The sheer breadth and diversity of different deep learning techniques makes it difficult to determine what scientific problems might be most amenable to these methods, or which specific combination of methods might offer the most promising first approach. In this survey, we focus on addressing this central issue, providing an overview of many widely used deep learning models, spanning visual, sequential and graph structured data, associated tasks and different training methods, along with techniques to use deep learning with less data and better interpret these complex models --- two central considerations for many scientific use cases. We also include overviews of the full design process, implementation tips, and links to a plethora of tutorials, research summaries and open-sourced deep learning pipelines and pretrained models, developed by the community. We hope that this survey will help accelerate the use of deep learning across different scientific domains.
