Paper Title
Near-Optimal Learning of Tree-Structured Distributions by Chow-Liu
Paper Authors
Paper Abstract
We provide finite-sample guarantees for the classical Chow-Liu algorithm (IEEE Trans.~Inform.~Theory, 1968) to learn a tree-structured graphical model of a distribution. For a distribution $P$ on $\Sigma^n$ and a tree $T$ on $n$ nodes, we say $T$ is an $\varepsilon$-approximate tree for $P$ if there is a $T$-structured distribution $Q$ such that $D(P \,\|\, Q)$ is at most $\varepsilon$ more than $\min_{Q'} D(P \,\|\, Q')$, where the minimum is over all tree-structured distributions $Q'$. We show that if $P$ itself is tree-structured, then the Chow-Liu algorithm, using the plug-in estimator for mutual information with $\widetilde{O}(|\Sigma|^3 n\varepsilon^{-1})$ i.i.d.~samples, outputs an $\varepsilon$-approximate tree for $P$ with constant probability. In contrast, for a general $P$ (which may not be tree-structured), $\Omega(n^2\varepsilon^{-2})$ samples are necessary to find an $\varepsilon$-approximate tree. Our upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart~(STOC, 2018): we prove that for three random variables $X, Y, Z$, each over $\Sigma$, testing whether $I(X; Y \mid Z)$ is $0$ or $\geq \varepsilon$ is possible with $\widetilde{O}(|\Sigma|^3/\varepsilon)$ samples. Finally, we show that for a specific tree $T$, with $\widetilde{O}(|\Sigma|^2 n\varepsilon^{-1})$ samples from a distribution $P$ over $\Sigma^n$, one can efficiently learn the closest $T$-structured distribution in KL divergence by applying the add-1 estimator at each node.
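The first guarantee concerns the Chow-Liu algorithm itself: estimate every pairwise mutual information with the plug-in (empirical) estimator and return a maximum-weight spanning tree over those estimates. Below is a minimal sketch, assuming `samples` is a list of length-$n$ tuples over a finite alphabet; the function names are illustrative, not from the paper.

```python
import math
from collections import Counter

def empirical_mi(samples, i, j):
    """Plug-in estimate of I(X_i; X_j) from the rows of `samples`."""
    n = len(samples)
    joint = Counter((row[i], row[j]) for row in samples)
    pi = Counter(row[i] for row in samples)
    pj = Counter(row[j] for row in samples)
    # I(X;Y) = sum_{a,b} p(a,b) * log[ p(a,b) / (p(a) p(b)) ]
    return sum((c / n) * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in joint.items())

def chow_liu_tree(samples, n_vars):
    """Chow-Liu: maximum-weight spanning tree over pairwise plug-in MI,
    grown here with Prim's algorithm starting from node 0."""
    edges = []
    # best[j] = (weight, endpoint already in the tree) for each j outside the tree
    best = {j: (empirical_mi(samples, 0, j), 0) for j in range(1, n_vars)}
    while best:
        j = max(best, key=lambda k: best[k][0])
        _, i = best.pop(j)
        edges.append((i, j))
        for k in best:
            w = empirical_mi(samples, j, k)
            if w > best[k][0]:
                best[k] = (w, j)
    return edges
```

Prim's algorithm is used only for brevity; any maximum spanning tree routine yields the same output up to ties in the estimated mutual informations.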
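The upper bound hinges on deciding whether $I(X; Y \mid Z)$ is $0$ or at least $\varepsilon$. The paper's tester, which attains the stated $\widetilde{O}(|\Sigma|^3/\varepsilon)$ sample complexity, is more delicate than naive plug-in estimation; purely for orientation, here is a sketch of the plug-in conditional mutual information that such a test thresholds, assuming `samples` is a list of $(x, y, z)$ triples.

```python
import math
from collections import Counter

def plug_in_cmi(samples):
    """Plug-in estimate of I(X; Y | Z) from i.i.d. (x, y, z) triples:
    I(X;Y|Z) = sum_{x,y,z} p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]."""
    n = len(samples)
    pxyz = Counter(samples)
    pxz = Counter((x, z) for x, _, z in samples)
    pyz = Counter((y, z) for _, y, z in samples)
    pz = Counter(z for _, _, z in samples)
    return sum((c / n) * math.log(c * pz[z] / (pxz[(x, z)] * pyz[(y, z)]))
               for (x, y, z), c in pxyz.items())
```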
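For the final result, the closest $T$-structured distribution to $P$ in KL divergence factorizes as a root marginal times child-given-parent conditionals along $T$, and each factor is fit with the add-1 (Laplace) estimator. A hedged sketch follows, with an arbitrary root (the factorization does not depend on the choice); the helper names and the `samples` representation are illustrative assumptions.

```python
import math
from collections import Counter

def add1_tree_distribution(samples, tree_edges, alphabet, root=0):
    """Fit a T-structured distribution: add-1 (Laplace) estimates of the root
    marginal and of each child-given-parent conditional along the tree.
    Returns a function computing the log-probability of a full assignment."""
    n, k = len(samples), len(alphabet)
    # Orient the tree away from the root to obtain parent pointers.
    adj = {}
    for i, j in tree_edges:
        adj.setdefault(i, []).append(j)
        adj.setdefault(j, []).append(i)
    parent, order, stack = {root: None}, [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                stack.append(v)
    # Add-1 smoothed root marginal.
    root_counts = Counter(row[root] for row in samples)
    root_prob = {a: (root_counts[a] + 1) / (n + k) for a in alphabet}
    # Add-1 smoothed conditionals P(X_v = a | X_parent(v) = b).
    cond = {}
    for v in order:
        u = parent[v]
        if u is None:
            continue
        pair = Counter((row[u], row[v]) for row in samples)
        par = Counter(row[u] for row in samples)
        cond[v] = {(a, b): (pair[(b, a)] + 1) / (par[b] + k)
                   for a in alphabet for b in alphabet}
    def log_prob(x):
        lp = math.log(root_prob[x[root]])
        for v in order:
            if parent[v] is not None:
                lp += math.log(cond[v][(x[v], x[parent[v]])])
        return lp
    return log_prob
```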