Paper title
Accelerating Copolymer Inverse Design using AI Gaming algorithm
Authors
Abstract
There exists a broad class of sequencing problems, for example in proteins and polymers, that can be formulated as heuristic search problems involving decision making akin to a computer game. AI gaming algorithms such as Monte Carlo tree search (MCTS) gained prominence after their exemplary performance in computer Go; they build decision trees aimed at identifying the path (sequence of moves) that the policy should take to reach the final winning or optimal solution. The major challenges in inverse sequencing problems are that the materials search space is extremely vast and that property evaluation for each sequence is computationally demanding. Reaching an optimal solution while minimizing the total number of evaluations in a given design cycle is therefore highly desirable. We demonstrate that this approach can be adopted for sequencing problems by growing a decision tree in which each node is a candidate sequence whose fitness is directly evaluated by molecular simulations. We interface MCTS with molecular dynamics (MD) simulations and use a representative example of designing a copolymer compatibilizer, where the goal is to identify sequence-specific copolymers that lead to zero interfacial energy between two immiscible homopolymers. We apply the MCTS algorithm to polymer chain lengths varying from 10-mer to 30-mer, for which the overall search space varies from 2^10 (1024) to 2^30 (~1 billion). In each case, we identify a target sequence that leads to zero interfacial energy within a few hundred evaluations, demonstrating the scalability and efficiency of MCTS in exploring practical materials design problems with exceedingly vast chemical/material search spaces. Our MCTS-MD framework can be easily extended to several other polymer and protein inverse design problems, in particular for cases where sequence-property data is either unavailable or resource intensive to obtain.
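To make the search loop described in the abstract concrete, the following is a minimal sketch of MCTS over two-monomer sequences, not the authors' implementation. It assumes a tree over sequence prefixes with random rollouts, a UCB1 selection rule, and a hypothetical `toy_energy` function standing in for the MD evaluation of interfacial energy. The names `Node`, `mcts_design`, `max_evals`, and the exploration constant `c` are illustrative choices that do not come from the paper.

```python
import math
import random

MONOMERS = "AB"  # two monomer types, so an N-mer has 2**N candidate sequences


def toy_energy(seq):
    """Hypothetical stand-in for an MD evaluation (assumption, not from the paper).

    Counts adjacent identical monomers, so only a strictly alternating
    chain reaches the target value of zero.
    """
    return sum(1 for a, b in zip(seq, seq[1:]) if a == b)


class Node:
    """Tree node holding a sequence prefix and its visit statistics."""

    def __init__(self, prefix, parent=None):
        self.prefix = prefix
        self.parent = parent
        self.children = {}
        self.visits = 0
        self.total_reward = 0.0

    def ucb(self, c):
        # UCB1 score; only called on nodes that were visited at least once.
        mean = self.total_reward / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits) / self.visits)


def mcts_design(length, energy_fn, max_evals=500, c=1.0, seed=0):
    """Search for a low-energy sequence using at most max_evals unique evaluations."""
    rng = random.Random(seed)
    root = Node("")
    cache = {}  # sequence -> energy, so each sequence is evaluated only once
    best_seq, best_energy = None, float("inf")
    iterations = 0
    while len(cache) < max_evals and best_energy > 0 and iterations < 50 * max_evals:
        iterations += 1
        node = root
        # Selection: descend through fully expanded nodes by UCB1.
        while len(node.prefix) < length and len(node.children) == len(MONOMERS):
            node = max(node.children.values(), key=lambda n: n.ucb(c))
        # Expansion: add one untried monomer unless the sequence is complete.
        if len(node.prefix) < length:
            monomer = rng.choice([m for m in MONOMERS if m not in node.children])
            child = Node(node.prefix + monomer, node)
            node.children[monomer] = child
            node = child
        # Rollout: complete the prefix at random and evaluate the full sequence.
        tail = "".join(rng.choice(MONOMERS) for _ in range(length - len(node.prefix)))
        seq = node.prefix + tail
        if seq not in cache:
            cache[seq] = energy_fn(seq)
        energy = cache[seq]
        if energy < best_energy:
            best_seq, best_energy = seq, energy
        # Backpropagation: lower energy gives a higher reward in (0, 1].
        reward = 1.0 / (1.0 + energy)
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best_seq, best_energy, len(cache)


seq, energy, n_evals = mcts_design(10, toy_energy)
print(seq, energy, n_evals)
```

In a real design cycle the cost is dominated by the energy evaluation, which is why the sketch caches results and budgets unique evaluations rather than tree iterations. Swapping `toy_energy` for a function that launches a simulation and returns the interfacial energy would leave the search loop unchanged.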