Paper title

Analyzing cellwise weighted data

Author

Rousseeuw, Peter J.

Abstract

Often the rows (cases, objects) of a dataset have weights. For instance, the weight of a case may reflect the number of times it has been observed, or its reliability. For analyzing such data many rowwise weighted techniques are available, the best known being the weighted average. But there are also situations where the individual cells (entries) of the data matrix have weights assigned to them. The purpose of this note is to provide an approach for analyzing such data. We define a cellwise weighted likelihood function that corresponds to a transformation of the dataset which we call unpacking. Using this weighted likelihood one can carry out multivariate statistical methods such as maximum likelihood estimation and likelihood ratio tests. We pay particular attention to the estimation of covariance matrices, because these are the building blocks of much of multivariate statistics. An R implementation of the cellwise maximum likelihood estimator is provided, which employs a version of the EM algorithm. A faster approximate method is also proposed, which is asymptotically equivalent to it.
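
To make the distinction between row weights and cell weights concrete, the following is a minimal Python sketch, not the paper's estimator. It contrasts the familiar rowwise weighted average with a simple per-column weighted average in which every cell carries its own weight. The function names, the toy data, and the choice of a per-column average are assumptions made for illustration only; the cellwise maximum likelihood estimator of the covariance matrix described in the abstract relies on an EM algorithm and is not reproduced here.

```python
import numpy as np


def rowwise_weighted_mean(X, w):
    """Weighted average of the rows of X, with one weight per row (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    # Each row contributes to every column with the same weight w[i].
    return (w[:, None] * X).sum(axis=0) / w.sum()


def cellwise_weighted_mean(X, W):
    """Per-column weighted average, with one weight per cell (illustrative assumption only)."""
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    # Column j is averaged with its own weight column W[:, j].
    return (W * X).sum(axis=0) / W.sum(axis=0)


# Toy data: 3 cases, 2 variables.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# Row weights: the third case counts twice.
w = np.array([1.0, 1.0, 2.0])

# Cell weights: same as w in the first column, but different in the second,
# where the first case gets weight 0 (its entry is not trusted at all).
W = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [2.0, 1.0]])

row_mean = rowwise_weighted_mean(X, w)
cell_mean = cellwise_weighted_mean(X, W)
print("rowwise weighted mean: ", row_mean)
print("cellwise weighted mean:", cell_mean)
```

When every column of `W` equals the row-weight vector `w`, the two functions return the same result, so row weights are a special case of cell weights in this sketch. The per-column average ignores the dependence between variables, which is exactly what a covariance estimate has to capture; that is why a likelihood-based treatment such as the one the abstract describes is needed beyond this simple illustration.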
