Title
On Learning Mixture of Linear Regressions in the Non-Realizable Setting
Authors
Abstract
While mixture of linear regressions (MLR) is a well-studied topic, prior works usually do not analyze such models for prediction error. In fact, {\em prediction} and {\em loss} are not well-defined in the context of mixtures. In this paper, we first show that MLR can be used for prediction where, instead of predicting a single label, the model predicts a list of values (also known as {\em list-decoding}). The list size equals the number of components in the mixture, and the loss function is defined to be the minimum among the losses incurred by the component models. We show that with this definition, a solution of empirical risk minimization (ERM) achieves a small probability of prediction error. This calls for an algorithm that minimizes the empirical risk for MLR, which is known to be computationally hard. Prior algorithmic works on MLR focus on the {\em realizable} setting, i.e., recovery of the parameters when the data is probabilistically generated by a mixed linear (noisy) model. In this paper we show that, under some regularity conditions on the dataset and the initial points, a version of the popular alternating minimization (AM) algorithm finds the best fit lines in a dataset even when a realizable model is not assumed, and thereby provides a solution to the ERM problem. We further provide an algorithm that runs in time polynomial in the number of datapoints and recovers a good approximation of the best fit lines. The two algorithms are compared experimentally.
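The two central objects in the abstract, the list-decoding loss (minimum loss over component models) and the alternating minimization (AM) heuristic for the resulting ERM problem, can be sketched as follows. This is an illustrative NumPy sketch, not the paper's algorithm: the function names (`am_mlr`, `list_loss`), the random initialization, and the fixed iteration count are assumptions for the example, whereas the paper assumes regularity conditions on the dataset and initial points.

```python
import numpy as np

def list_loss(X, y, W):
    """Empirical list-decoding risk: for each point, take the minimum
    squared loss over all component models, then average over points."""
    resid = (X @ W.T - y[:, None]) ** 2   # shape (n, k)
    return resid.min(axis=1).mean()

def am_mlr(X, y, k=2, iters=50, seed=0):
    """Alternating minimization for mixture of linear regressions.

    Alternates between (1) assigning each point to the component line
    with the smallest squared residual and (2) refitting each component
    by least squares on its assigned points. Each step can only
    decrease the empirical list-decoding loss, so the loss is
    monotonically non-increasing over iterations.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random initialization (illustrative; the paper's guarantees
    # require suitably regular initial points).
    W = rng.standard_normal((k, d))
    for _ in range(iters):
        resid = (X @ W.T - y[:, None]) ** 2   # residual under every component
        assign = resid.argmin(axis=1)         # best component per point
        for j in range(k):
            idx = assign == j
            if idx.sum() >= d:                # refit only with enough points
                W[j], *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return W
```

On data generated by a noiseless two-component mixed linear model, the true pair of weight vectors attains zero list-decoding loss, and the monotonicity of AM means its final loss never exceeds the loss of its initialization.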