Paper title
A critical review of LASSO and its derivatives for variable selection under dependence among covariates
Paper authors
Paper abstract
We study the limitations of the well-known LASSO regression as a variable selector when dependence structures exist among covariates. We analyze both the classical situation with $n\geq p$ and the high-dimensional framework with $p>n$. We examine the restrictive properties this methodology requires to guarantee optimality, as well as its practical drawbacks. These drawbacks are illustrated by means of an extensive simulation study covering different dependence scenarios. In search of improvements, we carry out a broad comparison with LASSO derivatives and alternatives. Finally, we give some guidance on which procedures perform best depending on the nature of the data.