Title
M-Variance Asymptotics and Uniqueness of Descriptors
Authors
Abstract
Asymptotic theory for M-estimation problems usually focuses on the asymptotic convergence of the sample descriptor, defined as the minimizer of the sample loss function. Here, we explore a related question and formulate asymptotic theory for the minimum value of the sample loss, the M-variance. Since the value of the loss function is always a real number, the asymptotic theory for the M-variance is comparatively simple. The M-variance often satisfies a standard central limit theorem even in situations where the asymptotics of the descriptor are more complicated, for example in the case of smeariness, or where no asymptotic distribution can be given at all, as can happen when the descriptor space is a general metric space. We use the asymptotic results for the M-variance to formulate a hypothesis test that systematically determines, for a given sample, whether the underlying population loss function may have multiple global minima. We discuss three applications of our test to data, each of which presents a typical scenario in which non-uniqueness of descriptors may occur. These model scenarios are the mean on a non-Euclidean space, non-linear regression, and Gaussian mixture clustering.
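To make the central notion concrete, here is a minimal sketch (not code from the paper) of the M-variance in the simplest M-estimation problem, the Euclidean mean: the sample loss is F_n(a) = (1/n) Σ_i (x_i − a)², its minimizer (the descriptor) is the sample mean, and its minimum value (the M-variance) coincides with the sample variance. The variable names and the synthetic data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)  # synthetic sample

def sample_loss(a, x):
    """Sample loss F_n(a) for the mean as an M-estimation problem."""
    return np.mean((x - a) ** 2)

descriptor = x.mean()                     # minimizer of F_n (the descriptor)
m_variance = sample_loss(descriptor, x)   # minimum value of F_n (the M-variance)

# For this particular loss, the M-variance is the sample variance,
# a real number to which a standard central limit theorem applies.
assert np.isclose(m_variance, x.var())
```

In more general descriptor spaces (e.g. a circle or another metric space) the minimizer may be hard to characterize asymptotically, but the minimum value remains a real number, which is what makes the M-variance analytically tractable.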