Paper title
Active Feature Acquisition with Generative Surrogate Models
Paper authors
Paper abstract
Many real-world situations allow for the acquisition of additional relevant information when making an assessment with limited or uncertain data. However, traditional ML approaches either require all features to be acquired beforehand or regard part of them as missing data that cannot be acquired. In this work, we consider models that perform active feature acquisition (AFA) and query the environment for unobserved features to improve their predictions at evaluation time. Our work reformulates the Markov decision process (MDP) that underlies the AFA problem as a generative modeling task and optimizes a policy via a novel model-based approach. We propose learning a generative surrogate model (GSM) that captures the dependencies among input features to assess the potential information gain from acquisitions. The GSM is leveraged to provide intermediate rewards and auxiliary information that help the agent navigate a complicated high-dimensional action space and sparse rewards. Furthermore, we extend AFA to the unsupervised case with a task we call active instance recognition (AIR), where the target variables are the unobserved features themselves and the goal is to collect information for a particular instance in a cost-efficient way. Empirical results demonstrate that our approach achieves considerably better performance than previous state-of-the-art methods on both supervised and unsupervised tasks.
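To make the idea of assessing information gain from an acquisition concrete, here is a minimal sketch under strong simplifying assumptions. It is not the paper's method: a small explicit probability table stands in for the learned generative surrogate model, and a greedy rule (expected gain minus cost) stands in for the learned policy. All function names, the toy distribution, and the costs are hypothetical.

```python
import math
from collections import defaultdict

def label_posterior(joint, observed):
    """p(y | x_o), where `observed` maps feature index -> acquired value."""
    weights = defaultdict(float)
    for x, y, p in joint:
        if all(x[i] == v for i, v in observed.items()):
            weights[y] += p
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_information_gain(joint, observed, candidate):
    """H(y | x_o) - E_{x_i | x_o}[ H(y | x_o, x_i) ] for candidate feature i."""
    value_mass = defaultdict(float)
    for x, y, p in joint:
        if all(x[i] == v for i, v in observed.items()):
            value_mass[x[candidate]] += p
    total = sum(value_mass.values())
    expected_after = sum(
        (m / total) * entropy(label_posterior(joint, {**observed, candidate: v}))
        for v, m in value_mass.items()
    )
    return entropy(label_posterior(joint, observed)) - expected_after

def acquire_greedily(joint, instance, costs):
    """Greedy stand-in for a policy: acquire while gain exceeds cost."""
    observed = {}
    while len(observed) < len(instance):
        scores = {
            i: expected_information_gain(joint, observed, i) - costs[i]
            for i in range(len(instance)) if i not in observed
        }
        best = max(scores, key=scores.get)
        if scores[best] <= 0:
            break  # no remaining feature is worth its cost; stop and predict
        observed[best] = instance[best]  # query the environment for this feature
    return observed, label_posterior(joint, observed)

# Toy model (assumption): two independent fair binary features, label y = x0.
joint = [((x0, x1), x0, 0.25) for x0 in (0, 1) for x1 in (0, 1)]
observed, posterior = acquire_greedily(joint, instance=(1, 0), costs=[0.1, 0.1])
print(observed, posterior)
```

In this toy setting only feature 0 carries information about the label, so the loop acquires it and stops before paying for feature 1. In the setting the abstract describes, a gain estimate of this kind would come from the learned surrogate and serve as an intermediate reward for the agent rather than drive a greedy rule.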