Paper Title

Time Series Causal Link Estimation under Hidden Confounding using Knockoff Interventions

Authors

Violeta Teodora Trifunov, Maha Shadaydeh, Joachim Denzler

Abstract

Latent variables often mask cause-effect relationships in observational data, giving rise to spurious links that may be misinterpreted as causal. This problem is of great interest in fields such as climate science and economics. We propose to estimate confounded causal links of time series using the Sequential Causal Effect Variational Autoencoder (SCEVAE) while applying Knockoff interventions. Knockoff variables have the same distribution as the originals and preserve their correlations with other variables. This allows for counterfactuals that are more faithful to the observational distribution. We show the advantage of Knockoff interventions by applying SCEVAE to synthetic datasets with both linear and nonlinear causal links. Moreover, we apply SCEVAE with Knockoffs to real aerosol-cloud-climate observational time series data. On synthetic data, we compare our results to those of a time series deconfounding method, both with and without estimated confounders, and show that our method outperforms this benchmark when both are evaluated against the ground truth. For the real data analysis, we rely on expert knowledge of causal links and demonstrate how using suitable proxy variables improves causal link estimation in the presence of hidden confounders.
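
To make the Knockoff property mentioned in the abstract concrete, the following is a minimal sketch of the standard Gaussian (model-X, equicorrelated) knockoff construction. It is not taken from the paper and is not necessarily how the authors generate their Knockoffs: it assumes i.i.d. zero-mean Gaussian rows with a known correlation matrix and ignores temporal dependence. The function name `gaussian_knockoffs` and the `shrink` parameter are hypothetical choices made for this illustration.

```python
import numpy as np


def gaussian_knockoffs(X, Sigma, rng, shrink=0.9):
    """Sample knockoffs for rows of X, assumed i.i.d. N(0, Sigma).

    Sigma is assumed to be a known correlation matrix (unit diagonal).
    Hypothetical helper for illustration only.
    """
    p = Sigma.shape[0]
    # Equicorrelated choice: one value s for every variable, shrunk
    # slightly so the conditional covariance stays positive definite.
    lam_min = np.linalg.eigvalsh(Sigma)[0]
    s = shrink * min(2.0 * lam_min, 1.0)
    D = s * np.eye(p)

    Sigma_inv_D = np.linalg.solve(Sigma, D)   # Sigma^{-1} D
    cond_mean = X - X @ Sigma_inv_D           # E[X_knock | X], row-wise
    cond_cov = 2.0 * D - D @ Sigma_inv_D      # Cov[X_knock | X]
    cond_cov = (cond_cov + cond_cov.T) / 2.0  # symmetrize against round-off

    L = np.linalg.cholesky(cond_cov)
    Z = rng.standard_normal(X.shape)
    return cond_mean + Z @ L.T, s


# Toy data: three equally correlated Gaussian variables (illustrative values).
rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.5, 0.5],
                  [0.5, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
X = rng.multivariate_normal(np.zeros(3), Sigma, size=20000)
X_knock, s = gaussian_knockoffs(X, Sigma, rng)

# Empirical covariance of the stacked vector [X, X_knock].
joint_cov = np.cov(np.hstack([X, X_knock]), rowvar=False)
```

In this construction the knockoff block of `joint_cov` should approximate `Sigma`, and the cross block should approximate `Sigma - s * I`: each knockoff keeps the original correlation with the other variables while being only weakly correlated with its own original.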
