Paper title
Generating Tertiary Protein Structures via an Interpretative Variational Autoencoder
Authors
Abstract
Much scientific enquiry across disciplines is founded upon a mechanistic treatment of dynamic systems that ties form to function. A highly visible instance of this is in molecular biology, where an important goal is to determine the functionally relevant forms/structures that a protein molecule employs to interact with molecular partners in the living cell. This goal is typically pursued under the umbrella of stochastic optimization, with algorithms that optimize a scoring function. Research repeatedly shows that current scoring functions, though steadily improving, correlate weakly with molecular activity. Inspired by recent momentum in generative deep learning, this paper proposes and evaluates an alternative approach to generating functionally relevant three-dimensional structures of a protein. Though deep generative models typically struggle with highly structured data, the work presented here circumvents this challenge via graph-generative models. A comprehensive evaluation of several deep architectures shows the promise of generative models in directly revealing the latent space for sampling novel tertiary structures, as well as in highlighting axes/factors that carry structural meaning and open the black box often associated with deep models. The work presented here is a first step towards interpretative, deep generative models becoming viable and informative complementary approaches to protein structure prediction.