Paper Title

Weak Signal Inclusion Under Dependence and Applications in Genome-wide Association Study

Authors

X. Jessie Jeng, Yifei Hu, Quan Sun, Yun Li

Abstract

Motivated by inquiries into weak signals in underpowered genome-wide association studies (GWASs), we consider the problem of retaining true signals that are not strong enough to be individually separable from a large amount of noise. We address the challenge from the perspective of false negative control and present false negative control (FNC) screening, a data-driven method to efficiently regulate the false negative proportion at a user-specified level. FNC screening is developed in a realistic setting with arbitrary covariance dependence between variables. We calibrate the overall dependence through a parameter whose scale is compatible with the existing phase diagram in high-dimensional sparse inference. Utilizing the new calibration, we asymptotically explicate the joint effect of covariance dependence, signal sparsity, and signal intensity on the proposed method. We interpret the results using a new phase diagram, which shows that FNC screening can efficiently select a set of candidate variables to retain a high proportion of signals even when the signals are not individually separable from noise. The finite-sample performance of FNC screening is compared with that of several existing methods in simulation studies. The proposed method outperforms the others in adapting to a user-specified false negative control level. We implement FNC screening to empower a two-stage GWAS procedure, which demonstrates substantial power gain when working with limited sample sizes in real applications.
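
To make the quantity being controlled concrete, the sketch below computes the false negative proportion (the share of true signals left out of a selected set) on simulated test statistics. It is not the authors' FNC screening procedure: it is an oracle illustration that uses the known signal labels, which a data-driven method would not have. The function names, the independent-noise simulation, and the parameter values (`p`, `s`, `mu`, the 0.1 target) are all assumptions chosen for illustration.

```python
import numpy as np


def false_negative_proportion(selected, is_signal):
    """Fraction of true signals that are NOT in the selected set."""
    selected = np.asarray(selected, dtype=bool)
    is_signal = np.asarray(is_signal, dtype=bool)
    n_signals = is_signal.sum()
    if n_signals == 0:
        return 0.0
    missed = np.sum(is_signal & ~selected)
    return float(missed / n_signals)


def smallest_set_meeting_fnp(z, is_signal, target_fnp):
    """Oracle illustration (uses the true labels, assumes at least one signal):
    return the indices of the fewest top-ranked variables, ordered by |z|,
    whose false negative proportion is at most target_fnp."""
    z = np.asarray(z, dtype=float)
    is_signal = np.asarray(is_signal, dtype=bool)
    order = np.argsort(-np.abs(z))          # strongest statistics first
    found = np.cumsum(is_signal[order])     # signals captured by the top k
    fnp = 1.0 - found / is_signal.sum()     # FNP after keeping the top k
    k = int(np.argmax(fnp <= target_fnp)) + 1
    return order[:k]


# Hypothetical simulation: a few weak signals among many independent nulls.
rng = np.random.default_rng(0)
p, s, mu = 2000, 40, 2.0                    # assumed values, not from the paper
is_signal = np.zeros(p, dtype=bool)
is_signal[:s] = True
z = rng.standard_normal(p) + mu * is_signal

keep = smallest_set_meeting_fnp(z, is_signal, target_fnp=0.1)
selected = np.zeros(p, dtype=bool)
selected[keep] = True

print("variables kept:", keep.size)
print("false negative proportion:", false_negative_proportion(selected, is_signal))
```

With signals this weak, the kept set will typically contain many noise variables alongside the signals, which is the trade-off a screening step accepts in order to limit false negatives.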
