Paper Title

Automated Creation and Human-assisted Curation of Computable Scientific Models from Code and Text

Authors

Mulwad, Varish; Crapo, Andrew; Kumar, Vijay S.; Jobin, James; Gabaldon, Alfredo; Virani, Nurali; Dixit, Sharad; Joshi, Narendra

Abstract

Scientific models hold the key to better understanding and predicting the behavior of complex systems. The most comprehensive manifestation of a scientific model, including crucial assumptions and parameters that underpin its usability, is usually embedded in associated source code and documentation, which may employ a variety of (potentially outdated) programming practices and languages. Domain experts cannot gain a complete understanding of the implementation of a scientific model if they are not familiar with the code. Furthermore, rapid research and development iterations make it challenging to keep up with constantly evolving scientific model codebases. To address these challenges, we develop a system for the automated creation and human-assisted curation of a knowledge graph of computable scientific models that analyzes a model's code in the context of any associated inline comments and external documentation. Our system uses knowledge-driven as well as data-driven approaches to identify and extract relevant concepts from code and equations from textual documents to semantically annotate models using domain terminology. These models are converted into executable Python functions and then can further be composed into complex workflows to answer different forms of domain-driven questions. We present experimental results obtained using a dataset of code and associated text derived from NASA's Hypersonic Aerodynamics website.
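
To make the idea of composing executable Python functions into a workflow more concrete, here is a minimal sketch. It is not the authors' system: the registry layout, the `run_workflow` helper, and the variable names (`gamma`, `R`, `T`, `v`, `a`, `mach`) are all assumptions chosen for illustration. The two models are the standard ideal-gas speed of sound and the Mach number definition, picked only because they fit the aerodynamics setting of the dataset.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical registry entry: (function, names of its inputs, name of its output).
Model = Tuple[Callable[..., float], List[str], str]


def speed_of_sound(gamma: float, gas_constant: float, temperature: float) -> float:
    """Ideal-gas speed of sound: a = sqrt(gamma * R * T)."""
    return math.sqrt(gamma * gas_constant * temperature)


def mach_number(velocity: float, sound_speed: float) -> float:
    """Mach number: ratio of flow velocity to the local speed of sound."""
    return velocity / sound_speed


# Assumed semantic annotations: each model declares which variables it consumes
# and which variable it produces.
MODELS: List[Model] = [
    (speed_of_sound, ["gamma", "R", "T"], "a"),
    (mach_number, ["v", "a"], "mach"),
]


def run_workflow(models: List[Model], known: Dict[str, float], target: str) -> float:
    """Apply any model whose inputs are all available until the target is computed."""
    values = dict(known)
    progress = True
    while target not in values and progress:
        progress = False
        for func, inputs, output in models:
            if output not in values and all(name in values for name in inputs):
                values[output] = func(*(values[name] for name in inputs))
                progress = True
    if target not in values:
        raise ValueError(f"cannot compute {target!r} from the given inputs")
    return values[target]


if __name__ == "__main__":
    # Sea-level air (assumed values), flow velocity in m/s.
    inputs = {"gamma": 1.4, "R": 287.0, "T": 288.15, "v": 680.0}
    print(run_workflow(MODELS, inputs, "mach"))
```

Chaining by declared input and output names is only one possible design; a knowledge-graph-backed system could instead select models by querying semantic annotations, which this sketch does not attempt.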
