Paper Title

Efficient Truncated Linear Regression with Unknown Noise Variance

Authors

Constantinos Daskalakis, Patroklos Stefanou, Rui Yao, Manolis Zampetakis

Abstract

Truncated linear regression is a classical challenge in statistics, wherein a label, $y = w^T x + \varepsilon$, and its corresponding feature vector, $x \in \mathbb{R}^k$, are only observed if the label falls in some subset $S \subseteq \mathbb{R}$; otherwise the existence of the pair $(x, y)$ is hidden from observation. Linear regression with truncated observations has remained a challenge, in its general form, since the early works of Tobin (1958) and Amemiya (1973). When the distribution of the error is normal with known variance, recent work of Daskalakis et al. (2019) provides computationally and statistically efficient estimators of the linear model, $w$. In this paper, we provide the first computationally and statistically efficient estimators for truncated linear regression when the noise variance is unknown, estimating both the linear model and the variance of the noise. Our estimator is based on an efficient implementation of Projected Stochastic Gradient Descent on the negative log-likelihood of the truncated sample. Importantly, we show that the error of our estimates is asymptotically normal, and we use this to provide explicit confidence regions for our estimates.
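To make the estimation idea concrete, the following is a minimal sketch of projected SGD on the truncated negative log-likelihood. It is an illustration under stated assumptions, not the authors' implementation. It assumes Gaussian noise, the reparameterization $v = w/\sigma^2$, $\lambda = 1/\sigma^2$ (under which the truncated likelihood is an exponential family and the objective is convex), a membership oracle `in_S` for the set $S$, and a one-sample rejection-sampling estimate of the gradient. The function names, the projection set (a ball for $v$ and an interval for $\lambda$), the step-size schedule, and the initialization are all hypothetical choices made for this sketch; the abstract does not specify them.

```python
import math
import random


def sample_truncated(mean, std, in_S, rng, max_tries=1000):
    """Rejection-sample z ~ N(mean, std^2) conditioned on z lying in S."""
    for _ in range(max_tries):
        z = rng.gauss(mean, std)
        if in_S(z):
            return z
    return None  # S has very little mass under the current parameters


def project(v, lam, radius=10.0, lam_min=0.1, lam_max=10.0):
    """Hypothetical projection: v onto a Euclidean ball, lam onto an interval."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm > radius:
        v = [c * radius / norm for c in v]
    return v, min(max(lam, lam_min), lam_max)


def psgd_truncated(X, y, in_S, steps=20000, lr=0.05, seed=0):
    """Projected SGD on the truncated negative log-likelihood.

    Parameters are v = w / sigma^2 and lam = 1 / sigma^2. For one pair
    (x, y), with z drawn from the current model truncated to S, an
    unbiased gradient estimate is
        d/dv   = (z - y) * x
        d/dlam = (y^2 - z^2) / 2
    """
    rng = random.Random(seed)
    k = len(X[0])
    v, lam = [0.0] * k, 1.0  # assumed initialization
    for t in range(1, steps + 1):
        i = rng.randrange(len(X))
        x_i, y_i = X[i], y[i]
        # Under (v, lam) the untruncated label is N(v.x / lam, 1 / lam).
        mean = sum(a * b for a, b in zip(v, x_i)) / lam
        z = sample_truncated(mean, 1.0 / math.sqrt(lam), in_S, rng)
        if z is None:
            continue  # skip this step if no sample landed in S
        step = lr / math.sqrt(t)  # assumed step-size schedule
        v = [vj - step * (z - y_i) * xj for vj, xj in zip(v, x_i)]
        lam = lam - step * 0.5 * (y_i ** 2 - z ** 2)
        v, lam = project(v, lam)
    # Map back to the original parameters (w, sigma^2).
    return [vj / lam for vj in v], 1.0 / lam
```

For example, with labels observed only when positive, one would call `psgd_truncated(X, y, lambda z: z > 0)`, where `X` is a list of feature lists and `y` the matching observed labels; the call returns estimates of $w$ and $\sigma^2$. Because the projection clips $\lambda$ to a fixed interval, the returned variance estimate is confined to that interval's reciprocal range, so the bounds should be chosen to contain the plausible noise level.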
