Paper Title
Deep Reference Priors: What is the best way to pretrain a model?
Paper Authors
Paper Abstract
What is the best way to exploit extra data -- be it unlabeled data from the same task, or labeled data from a related task -- to learn a given task? This paper formalizes the question using the theory of reference priors. Reference priors are objective, uninformative Bayesian priors that maximize the mutual information between the task and the weights of the model. Such priors enable the task to maximally affect the Bayesian posterior; e.g., reference priors depend upon the number of samples available for learning the task, and for very small sample sizes they put more probability mass on low-complexity models in the hypothesis space. This paper presents the first demonstration of reference priors for medium-scale deep networks and image-based data. We develop generalizations of reference priors and demonstrate applications to two problems. First, by using unlabeled data to compute the reference prior, we develop new Bayesian semi-supervised learning methods that remain effective even with very few samples per class. Second, by using labeled data from the source task to compute the reference prior, we develop a new pretraining method for transfer learning that allows data from the target task to maximally affect the Bayesian posterior. Empirical validation of these methods is conducted on image classification datasets. Code is available at https://github.com/grasp-lyrl/deep_reference_priors.
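To make the objective concrete: for a finite hypothesis set {θ_1, ..., θ_K} and observation X, the reference prior is the distribution π maximizing the mutual information I(θ; X) = Σ_k π_k KL(p(x | θ_k) || m(x)), where m(x) = Σ_k π_k p(x | θ_k) is the marginal over data. The sketch below is an illustration, not the paper's algorithm: it computes this capacity-achieving prior for a toy discrete model using the classical Blahut-Arimoto iteration, assuming a single observation and strictly positive likelihoods; the function name and the coin-flip example are hypothetical.

```python
import numpy as np

def blahut_arimoto(likelihood, iters=500, tol=1e-10):
    """Reference prior over a finite hypothesis set via Blahut-Arimoto.

    likelihood[k, x] = p(x | theta_k); each row is a distribution over
    observations. Returns the prior pi over the K hypotheses that
    maximizes the mutual information I(theta; X).
    """
    K, _ = likelihood.shape
    pi = np.full(K, 1.0 / K)        # start from the uniform prior
    for _ in range(iters):
        marginal = pi @ likelihood  # m(x) = sum_k pi_k p(x | theta_k)
        # KL(p(. | theta_k) || m) for every hypothesis k
        kl = np.sum(likelihood * np.log(likelihood / marginal), axis=1)
        new_pi = pi * np.exp(kl)    # Blahut-Arimoto fixed-point update
        new_pi /= new_pi.sum()
        if np.max(np.abs(new_pi - pi)) < tol:
            return new_pi
        pi = new_pi
    return pi

# Toy example: three coins with head probabilities 0.1, 0.5, 0.9, observed once.
p_heads = np.array([0.1, 0.5, 0.9])
lik = np.stack([p_heads, 1.0 - p_heads], axis=1)  # shape (3, 2): P(heads), P(tails)
print(blahut_arimoto(lik))                        # approx [0.5, 0.0, 0.5]
```

The output illustrates the abstract's point: the reference prior concentrates its mass on the two extreme coins, which a single flip can distinguish, and assigns essentially no mass to the uninformative middle one. The paper scales a generalization of this objective to deep networks and image data; the sketch covers only the exact single-observation, finite-hypothesis case.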