论文标题
干预有效算法用于近似因果图
Intervention Efficient Algorithms for Approximate Learning of Causal Graphs
论文作者
论文摘要
我们研究了在潜在的存在下学习一组观察到的变量之间因果关系的问题,同时最大程度地减少了观察到的变量的干预成本。我们假设在观察到的变量上访问了无向图$ g $,其边缘代表所有直接的因果关系,或者不太限制因果关系的超集(例如,通过有条件的独立性测试或域专家识别)。我们的目标是通过最低成本的干预措施以$ g $恢复所有因果关系或祖先关系的指示。众所周知,为任意图$ g $构建确切的最低成本干预设置是NP-HARD。我们进一步认为,以近似图形着色的硬度为条件,没有多项式时间算法可以更好地达到近似因素,而近似算法比$θ(\ log n)$更好,其中$ n $是$ g $中观察到的变量的数量。为了克服这一限制,我们引入了一个双标准近似目标,该目标使我们可以在$ g $中恢复除$εn^2 $边的所有指示,对于某些指定的错误参数$ε> 0 $。在这个轻松的目标下,我们给出多项式时间算法,以在最佳恒定因素的少量因素内实现干预成本。我们的算法结合了有效干预设计和低成本分离设置系统的设计,以及有关图形属性测试的想法。
We study the problem of learning the causal relationships between a set of observed variables in the presence of latents, while minimizing the cost of interventions on the observed variables. We assume access to an undirected graph $G$ on the observed variables whose edges represent either all direct causal relationships or, less restrictively, a superset of causal relationships (identified, e.g., via conditional independence tests or a domain expert). Our goal is to recover the directions of all causal or ancestral relations in $G$, via a minimum cost set of interventions. It is known that constructing an exact minimum cost intervention set for an arbitrary graph $G$ is NP-hard. We further argue that, conditioned on the hardness of approximate graph coloring, no polynomial time algorithm can achieve an approximation factor better than $Θ(\log n)$, where $n$ is the number of observed variables in $G$. To overcome this limitation, we introduce a bi-criteria approximation goal that lets us recover the directions of all but $εn^2$ edges in $G$, for some specified error parameter $ε> 0$. Under this relaxed goal, we give polynomial time algorithms that achieve intervention cost within a small constant factor of the optimal. Our algorithms combine work on efficient intervention design and the design of low-cost separating set systems, with ideas from the literature on graph property testing.