Paper Title
H2-Golden-Retriever: Methodology and Tool for an Evidence-Based Hydrogen Research Grantsmanship
Paper Authors
Paper Abstract
Hydrogen is poised to play a major role in decarbonizing the economy. This study was motivated by two needs: to discover, develop, and understand low-cost, high-performance, durable materials that can help reduce the cost of electrolysis, and to provide an intelligent tool that makes evidence-based hydrogen research funding decisions easier. In this work, we developed the H2 Golden Retriever (H2GR) system for hydrogen knowledge discovery and representation using Natural Language Processing (NLP), Knowledge Graph, and Decision Intelligence. The system represents a novel methodology that encapsulates state-of-the-art techniques for evidence-based research grantsmanship. Relevant hydrogen papers were scraped from the web and indexed, and preprocessing consisted of noise and stop-word removal, language and spell checking, stemming, and lemmatization. The NLP tasks included Named Entity Recognition using Stanford NER and spaCy NER, and topic modeling using Latent Dirichlet Allocation and TF-IDF. The Knowledge Graph module, built on an ontology of the hydrogen production domain, was used to generate meaningful entities and their relationships, as well as trends and patterns in relevant H2 papers. The Decision Intelligence component provides stakeholders with a simulation environment for cost and quantity dependencies. The PageRank algorithm was used to rank papers of interest. Random searches were run on the proposed H2GR, and the results included a list of papers ranked by relevancy score, entities, graphs of relationships between the entities, an ontology of H2 production, and Causal Decision Diagrams showing component interactivity. Domain experts assessed the system qualitatively and deemed it to function at a satisfactory level.
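The abstract states that PageRank is used to rank papers of interest but does not say which graph the ranking runs over or which implementation is used. The following is a minimal sketch of PageRank by power iteration, assuming a citation graph in which an edge points from a citing paper to the paper it cites. The function name `pagerank`, the paper identifiers, the damping factor of 0.85, and the toy graph are all illustrative assumptions and are not taken from H2GR.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over a dict mapping node -> list of cited nodes.

    Hypothetical helper for illustration; not the H2GR implementation.
    """
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(max_iter):
        # Papers with no outgoing citations spread their rank uniformly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Each citing paper splits its damped rank evenly among its targets.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share

        # Stop once the total change between iterations is negligible.
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy citation graph (assumed data): a and b cite c, c cites d.
citations = {
    "paper_a": ["paper_c"],
    "paper_b": ["paper_c"],
    "paper_c": ["paper_d"],
    "paper_d": [],
}

scores = pagerank(citations)
ranking = sorted(scores, key=scores.get, reverse=True)
for paper in ranking:
    print(f"{paper}: {scores[paper]:.4f}")
```

In this toy graph the most heavily cited chain ends at `paper_d`, so it receives the highest score, followed by `paper_c`. A production system would more likely rely on an established graph library and combine the graph score with a text relevance score such as TF-IDF, but how H2GR combines them is not described in the abstract.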