Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization

Paper Title

Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization

Authors

Leo Feng, Padideh Nouri, Aneri Muni, Yoshua Bengio, Pierre-Luc Bacon

Abstract

Accelerating the design of biological sequences could have a substantial impact on progress in medicine. The problem can be framed as a global optimization problem in which the objective is an expensive black-box function that can be queried in large batches, but only for a small number of rounds. Bayesian Optimization is a principled method for tackling this problem. However, the astronomically large state space of biological sequences makes brute-force iteration over all possible sequences infeasible. In this paper, we propose MetaRLBO, in which we train an autoregressive generative model via Meta-Reinforcement Learning to propose promising sequences for selection via Bayesian Optimization. We pose this problem as that of finding an optimal policy over a distribution of MDPs induced by sampling subsets of the data acquired in previous rounds. Our in-silico experiments show that meta-learning over such ensembles provides robustness against reward misspecification and achieves competitive results compared to existing strong baselines.
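
The batched, few-round loop described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the `toy_oracle` objective, the DNA alphabet, ridge regression as the proxy reward model, and an upper-confidence-bound score as the acquisition function. In particular, random single-site mutation stands in for the meta-learned autoregressive policy, so no reinforcement learning is performed here. What the sketch does show is how sampling subsets of previously acquired data induces an ensemble of proxy rewards, and how a batch is then selected from proposed candidates.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed alphabet, for illustration only


def toy_oracle(seq):
    # Hypothetical stand-in for the expensive black-box objective.
    return float(seq.count("G"))


def one_hot(seq):
    x = np.zeros((len(seq), len(ALPHABET)))
    for i, ch in enumerate(seq):
        x[i, ALPHABET.index(ch)] = 1.0
    return x.ravel()


def fit_ensemble(seqs, scores, n_models, frac, rng):
    # Each proxy reward model is a ridge regressor fit on a random subset
    # of the data acquired so far (one subset per model).
    X = np.stack([one_hot(s) for s in seqs])
    y = np.asarray(scores, dtype=float)
    k = max(1, int(frac * len(seqs)))
    weights = []
    for _ in range(n_models):
        idx = rng.choice(len(seqs), size=k, replace=False)
        A = X[idx].T @ X[idx] + np.eye(X.shape[1])  # ridge penalty of 1.0
        weights.append(np.linalg.solve(A, X[idx].T @ y[idx]))
    return np.stack(weights)


def propose(seqs, n_tries, rng):
    # Stand-in for the learned policy: single-site mutations of known sequences.
    seen, out = set(seqs), set()
    for _ in range(n_tries):
        s = list(seqs[rng.integers(len(seqs))])
        s[rng.integers(len(s))] = ALPHABET[rng.integers(len(ALPHABET))]
        cand = "".join(s)
        if cand not in seen:
            out.add(cand)
    return sorted(out)


def select_batch(cands, weights, batch_size, beta=1.0):
    # Upper-confidence-bound score: ensemble mean plus beta times ensemble spread.
    X = np.stack([one_hot(s) for s in cands])
    preds = X @ weights.T  # shape (n_candidates, n_models)
    ucb = preds.mean(axis=1) + beta * preds.std(axis=1)
    top = np.argsort(-ucb)[:batch_size]
    return [cands[i] for i in top]


def run(n_rounds=3, batch_size=8, seed=0):
    rng = np.random.default_rng(seed)
    seqs = ["".join(rng.choice(list(ALPHABET), size=8)) for _ in range(batch_size)]
    scores = [toy_oracle(s) for s in seqs]
    for _ in range(n_rounds):  # few rounds, one large batch of queries per round
        weights = fit_ensemble(seqs, scores, n_models=5, frac=0.8, rng=rng)
        cands = propose(seqs, 64, rng)
        batch = select_batch(cands, weights, batch_size)
        seqs += batch
        scores += [toy_oracle(s) for s in batch]
    return seqs, scores


if __name__ == "__main__":
    seqs, scores = run()
    print("best score found:", max(scores))
```

Scoring candidates by the ensemble mean plus its spread is one simple way to trade off exploitation and exploration when the proxy models disagree; the paper itself may use a different acquisition function.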
