Paper title

Probabilistic thermal stability prediction through sparsity promoting transformer representation

Authors

Zainchkovskyy, Yevgen; Ferkinghoff-Borg, Jesper; Bennett, Anja; Egebjerg, Thomas; Lorenzen, Nikolai; Greisen, Per Jr.; Hauberg, Søren; Stahlhut, Carsten

Abstract

Pre-trained protein language models have demonstrated significant applicability in different protein engineering tasks. A common way to use the latent representations of these pre-trained transformer models is to mean-pool across residue positions, reducing the feature dimensions for downstream tasks such as predicting biophysical properties or other functional behaviours. In this paper we provide a two-fold contribution to machine learning (ML) driven drug design. Firstly, we demonstrate the power of sparsity by promoting penalization of pre-trained transformer models to secure more robust and accurate melting temperature (Tm) prediction of single-chain variable fragments, with a mean absolute error of 0.23 °C. Secondly, we demonstrate the power of framing our prediction problem in a probabilistic framework. Specifically, we advocate for the need to adopt probabilistic frameworks, especially in the context of ML driven drug design.
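
For readers unfamiliar with the mean-pooling step the abstract refers to, the following is a minimal sketch of averaging per-residue embeddings into one fixed-length vector per sequence. It is illustrative only and not the authors' code: the function name `mean_pool` and the toy 3-residue, 4-feature input are assumptions, and a real protein language model would produce embeddings with hundreds or thousands of dimensions.

```python
from typing import List


def mean_pool(residue_embeddings: List[List[float]]) -> List[float]:
    """Average per-residue embedding vectors into one fixed-length vector.

    residue_embeddings has one row per residue position and one column
    per embedding feature (hypothetical layout, chosen for illustration).
    """
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    dim = len(residue_embeddings[0])
    if any(len(vec) != dim for vec in residue_embeddings):
        raise ValueError("all residue embeddings must share one dimension")
    n = len(residue_embeddings)
    # Average each feature column over all residue positions.
    return [sum(vec[j] for vec in residue_embeddings) / n for j in range(dim)]


# Toy input: 3 residues, 4 features each (made-up numbers).
toy = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
pooled = mean_pool(toy)
print(pooled)  # [2.0, 1.0, 2.0, 2.0]
```

The pooled vector has the same length regardless of sequence length, which is what makes it convenient as input to a downstream regressor. The trade-off is that every residue position is weighted equally, which is the default the abstract contrasts with a sparsity-promoting alternative.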
