Paper Title
Rates of Convergence for Laplacian Semi-Supervised Learning with Low Labeling Rates
Paper Authors
Paper Abstract
We study graph-based Laplacian semi-supervised learning at low labeling rates. Laplacian learning uses harmonic extension on a graph to propagate labels. At very low label rates, Laplacian learning becomes degenerate and the solution is roughly constant with spikes at each labeled data point. Previous work has shown that this degeneracy occurs when the number of labeled data points is finite while the number of unlabeled data points tends to infinity. In this work we allow the number of labeled data points to grow to infinity with the number of unlabeled data points. Our results show that for a random geometric graph with length scale $\varepsilon>0$ and labeling rate $\beta>0$, if $\beta\ll\varepsilon^2$ then the solution becomes degenerate and spikes form, and if $\beta\gg \varepsilon^2$ then Laplacian learning is well-posed and consistent with a continuum Laplace equation. Furthermore, in the well-posed setting we prove quantitative error estimates of $O(\varepsilon\beta^{-1/2})$ for the difference between the solutions of the discrete problem and the continuum PDE, up to logarithmic factors. We also study $p$-Laplacian regularization and show the same degeneracy result when $\beta\ll \varepsilon^p$. The proofs of our well-posedness results use the random walk interpretation of Laplacian learning and PDE arguments, while the proofs of the ill-posedness results use $\Gamma$-convergence tools from the calculus of variations. We also present numerical results on synthetic and real data to illustrate our results.
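To make the setup in the abstract concrete, here is a minimal numerical sketch (not the authors' code) of Laplacian learning on a random geometric graph over the unit square. The 0/1 epsilon-ball kernel, the values n = 2000, eps = 0.1, beta = 0.05, and the left/right-half ground truth are illustrative assumptions; labels are propagated by harmonic extension, i.e. the solution is fixed on labeled points and satisfies the graph Laplace equation on unlabeled points.

```python
# Minimal sketch of Laplacian learning on a random geometric graph.
# Assumptions (not from the paper): 0/1 epsilon-ball weights, uniform points
# in the unit square, each point labeled independently with probability beta.
import numpy as np
from scipy.spatial.distance import cdist

n, eps, beta = 2000, 0.1, 0.05          # sample size, graph length scale, labeling rate
rng = np.random.default_rng(0)
X = rng.uniform(size=(n, 2))            # random points in [0,1]^2

W = (cdist(X, X) < eps).astype(float)   # epsilon-ball weight matrix (0/1 kernel)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W          # unnormalized graph Laplacian

labeled = rng.random(n) < beta          # Bernoulli(beta) labeling
g = (X[:, 0] > 0.5).astype(float)       # illustrative ground truth: left vs. right half

# Harmonic extension: u = g on labeled points, (L u)_i = 0 on unlabeled points,
# i.e. u_I solves L_II u_I = -L_IJ g_J.  Requires every connected component of
# the graph to contain at least one labeled point (true w.h.p. here).
u = g.copy()
I, J = ~labeled, labeled
u[I] = np.linalg.solve(L[np.ix_(I, I)], -L[np.ix_(I, J)] @ g[J])

pred = (u > 0.5).astype(float)
print("labeled points:", int(labeled.sum()), " accuracy:", (pred == g).mean())
```

With these illustrative parameters, beta = 0.05 is well above eps^2 = 0.01, so the sketch sits roughly in the well-posed regime described in the abstract; shrinking beta far below eps^2 should instead produce the nearly constant, spike-dominated solutions of the degenerate regime.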