Paper Title

Identification of feasible pathway information for c-di-GMP binding proteins in cellulose production

Authors

Hassan, Syeda Sakira; Mangayil, Rahul; Aho, Tommi; Yli-Harja, Olli; Karp, Matti

Abstract

In this paper, we use a machine learning approach to identify the significant pathways for c-di-GMP signaling proteins. The dataset comprises gene counts from 12 pathways and 5 essential c-di-GMP binding domains across 1024 bacterial genomes. Two novel approaches, the least absolute shrinkage and selection operator (Lasso) and random forests, are applied to analyze and model the dataset. Both approaches show that bacterial chemotaxis is the most essential pathway for c-di-GMP encoding domains. Although Lasso is popular for feature selection, its strong regularization fails to associate any pathway with the MshE domain. The results may help to understand and emphasize the supporting pathways involved in bacterial cellulose production. These findings demonstrate the need for a chassis that restricts behavior or functionality by deactivating selective pathways in cellulose production.
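The abstract's remark about Lasso's strong regularization can be illustrated with a minimal sketch. It is not the authors' pipeline. It runs a plain coordinate-descent Lasso on synthetic data shaped like the dataset described above (1024 rows, 12 features). The data, the choice of pathways 0 and 3 as the true drivers, the `alpha` values, and all function names are assumptions made for illustration. Random forests are not covered here.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within [-threshold, threshold] becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def center_columns(X):
    """Subtract each column's mean so the model needs no intercept."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def lasso_coordinate_descent(X, y, alpha, n_iters=50):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iters):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant feature carries no signal
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic stand-in data (NOT the paper's dataset): 1024 "genomes", 12 "pathways".
random.seed(0)
N_GENOMES, N_PATHWAYS = 1024, 12
X_raw = [[random.gauss(0.0, 1.0) for _ in range(N_PATHWAYS)] for _ in range(N_GENOMES)]
# Assumed ground truth: only pathways 0 and 3 drive the hypothetical domain count.
y_raw = [2.0 * row[0] + 0.5 * row[3] + random.gauss(0.0, 0.1) for row in X_raw]

X = center_columns(X_raw)
y_mean = sum(y_raw) / N_GENOMES
y = [value - y_mean for value in y_raw]

weak = lasso_coordinate_descent(X, y, alpha=0.1)    # mild penalty
strong = lasso_coordinate_descent(X, y, alpha=5.0)  # penalty above every feature's signal

selected_weak = [j for j, coef in enumerate(weak) if coef != 0.0]
selected_strong = [j for j, coef in enumerate(strong) if coef != 0.0]
print("pathways kept with alpha=0.1:", selected_weak)
print("pathways kept with alpha=5.0:", selected_strong)
```

With a mild penalty the informative features keep nonzero coefficients. Once `alpha` exceeds the largest feature-response correlation, every coefficient is thresholded to exactly zero. That is one way a Lasso fit can end up associating no pathway with a given domain.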
