Multi-class Gaussian Process Classification with Noisy Inputs

Paper title

Multi-class Gaussian Process Classification with Noisy Inputs

Authors

Carlos Villacampa-Calvo, Bryan Zaldivar, Eduardo C. Garrido-Merchán, Daniel Hernández-Lobato

Abstract

It is a common practice in the machine learning community to assume that the observed data are noise-free in the input attributes. Nevertheless, scenarios with input noise are common in real problems, as measurements are never perfectly accurate. If this input noise is not taken into account, a supervised machine learning method is expected to perform sub-optimally. In this paper, we focus on multi-class classification problems and use Gaussian processes (GPs) as the underlying classifier. Motivated by a data set coming from the astrophysics domain, we hypothesize that the observed data may contain noise in the inputs. Therefore, we devise several multi-class GP classifiers that can account for input noise. Such classifiers can be efficiently trained using variational inference to approximate the posterior distribution of the latent variables of the model. Moreover, in some situations, the amount of noise can be known beforehand. If this is the case, it can be readily introduced in the proposed methods. This prior information is expected to lead to better performance results. We have evaluated the proposed methods by carrying out several experiments, involving synthetic and real data. These include several data sets from the UCI repository, the MNIST data set and a data set coming from astrophysics. The results obtained show that, although the classification error is similar across methods, the predictive distribution of the proposed methods is better, in terms of the test log-likelihood, than the predictive distribution of a classifier based on GPs that ignores input noise.
