Paper Title
Adaptive Invariance for Molecule Property Prediction
Paper Authors
Abstract
Effective property prediction methods can help accelerate the search for COVID-19 antivirals, either through accurate in-silico screens or by effectively guiding ongoing at-scale experimental efforts. However, existing prediction tools have limited ability to accommodate the scarce or fragmented training data currently available. In this paper, we introduce a novel approach to learn predictors that can generalize or extrapolate beyond the heterogeneous data. Our method builds on and extends the recently proposed invariant risk minimization, adaptively forcing the predictor to avoid nuisance variation. We achieve this by continually exercising and manipulating latent representations of molecules to highlight undesirable variation to the predictor. To test the method, we use a combination of three data sources: SARS-CoV-2 antiviral screening data, molecular fragments that bind to the SARS-CoV-2 main protease, and large screening data for SARS-CoV-1. Our predictor outperforms state-of-the-art transfer learning methods by a significant margin. We also report the top 20 predictions of our model on the Broad Drug Repurposing Hub.