Paper Title

State-specific protein-ligand complex structure prediction with a multi-scale deep generative model

Authors

Zhuoran Qiao, Weili Nie, Arash Vahdat, Thomas F. Miller III, Anima Anandkumar

Abstract

The binding complexes formed by proteins and small molecule ligands are ubiquitous and critical to life. Despite recent advancements in protein structure prediction, existing algorithms are so far unable to systematically predict the binding ligand structures along with their regulatory effects on protein folding. To address this discrepancy, we present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures solely using protein sequence and ligand molecular graph inputs. NeuralPLexer adopts a deep generative model to sample the 3D structures of the binding complex and their conformational changes at an atomistic resolution. The model is based on a diffusion process that incorporates essential biophysical constraints and a multi-scale geometric deep learning system to iteratively sample residue-level contact maps and all heavy-atom coordinates in a hierarchical manner. NeuralPLexer achieves state-of-the-art performance compared to all existing methods on benchmarks for both protein-ligand blind docking and flexible binding site structure recovery. Moreover, owing to its specificity in sampling both ligand-free-state and ligand-bound-state ensembles, NeuralPLexer consistently outperforms AlphaFold2 in terms of global protein structure accuracy on both representative structure pairs with large conformational changes (average TM-score=0.93) and recently determined ligand-binding proteins (average TM-score=0.89). Case studies reveal that the predicted conformational variations are consistent with structure determination experiments for important targets, including human KRAS$^\textrm{G12C}$, ketol-acid reductoisomerase, and purine GPCRs. Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
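The abstract reports global structure accuracy as an average TM-score. For readers unfamiliar with the metric, the following is a minimal sketch of the standard TM-score formula, evaluated for two equal-length C-alpha traces that are assumed to be already superimposed. It is not the paper's evaluation code: a full TM-score also searches for the superposition that maximizes the score, and the function names `tm_d0` and `tm_score_fixed_alignment` and the 0.5 Å floor on `d0` for short chains are illustrative assumptions.

```python
import math


def tm_d0(length):
    """Length-dependent distance scale d0 = 1.24 * cbrt(L - 15) - 1.8 (in Angstrom).

    Assumption: short chains are clamped to a floor of 0.5 so d0 stays positive.
    """
    if length <= 15:
        return 0.5
    return max(1.24 * (length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score_fixed_alignment(pred, ref):
    """TM-score of `pred` against `ref` for a fixed, pre-computed superposition.

    Both arguments are sequences of (x, y, z) C-alpha coordinates with a
    one-to-one residue correspondence. No superposition search is performed,
    so the result is a lower bound on the true TM-score.
    """
    if len(pred) != len(ref) or len(ref) == 0:
        raise ValueError("pred and ref must be non-empty and of equal length")
    d0 = tm_d0(len(ref))
    # Each residue pair contributes 1 / (1 + (d_i / d0)^2), normalized by the reference length.
    total = sum(1.0 / (1.0 + (math.dist(p, r) / d0) ** 2) for p, r in zip(pred, ref))
    return total / len(ref)
```

Under this definition, identical structures score 1.0, and a structure whose residues are all displaced by exactly `d0` scores 0.5.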