Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

Paper Title

Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

Authors

Choquette-Choo, Christopher A., McMahan, H. Brendan, Rush, Keith, Thakurta, Abhradeep

Abstract

We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) with multiple passes (epochs) over a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. We formalize the problem of DP mechanisms for adaptive streams with multiple participations and introduce a non-trivial extension of online matrix factorization DP mechanisms to our setting. This includes establishing the necessary theory for sensitivity calculations and efficient computation of optimal matrices. For some applications like $>\!\! 10,000$ SGD steps, applying these optimal techniques becomes computationally expensive. We thus design an efficient Fourier-transform-based mechanism with only a minor utility loss. Extensive empirical evaluation on both example-level DP for image classification and user-level DP for language modeling demonstrates substantial improvements over all previous methods, including the widely used DP-SGD. Though our primary application is to ML, our main DP results are applicable to arbitrary linear queries and hence may have much broader applicability.
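To make the idea of a matrix factorization mechanism concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a prefix-sum workload A (lower-triangular all-ones), a square-root Toeplitz factorization A = C·C chosen here purely for illustration (the abstract's optimal matrices and Fourier-transform-based mechanism are not reproduced), per-step contributions with L2 norm at most 1, and a participation schedule in which a record reappears every `steps_per_epoch` steps. The function names and the `noise_multiplier` parameter are hypothetical. The sensitivity formula used (norm of the sum of the participating columns of C) is assumed to hold because all entries of CᵀC are nonnegative for this C.

```python
import numpy as np

def sqrt_prefix_factor(n):
    """Lower-triangular Toeplitz C with C @ C equal to the prefix-sum matrix.

    The coefficients are the power-series coefficients of (1 - x)^(-1/2).
    This is an illustrative factorization, not an optimized one.
    """
    coeffs = np.ones(n)
    for k in range(1, n):
        coeffs[k] = coeffs[k - 1] * (2 * k - 1) / (2 * k)
    C = np.zeros((n, n))
    for i in range(n):
        C[i, : i + 1] = coeffs[: i + 1][::-1]
    return C

def multi_epoch_sensitivity(C, steps_per_epoch):
    """Sensitivity of x -> C @ x when one record participates every
    `steps_per_epoch` steps (assumes entries of C.T @ C are nonnegative)."""
    return max(
        np.linalg.norm(C[:, start::steps_per_epoch].sum(axis=1))
        for start in range(steps_per_epoch)
    )

def matrix_mechanism(x, B, C, sensitivity, noise_multiplier, rng):
    """Release B @ (C @ x + z), with Gaussian z scaled to the sensitivity of C."""
    z = rng.normal(0.0, noise_multiplier * sensitivity, size=x.shape)
    return B @ (C @ x + z)

if __name__ == "__main__":
    n, dim, steps_per_epoch = 8, 3, 4  # 2 epochs of 4 steps each (assumed)
    rng = np.random.default_rng(0)
    x = rng.normal(size=(n, dim))  # stand-in for clipped per-step gradients
    C = sqrt_prefix_factor(n)
    sens = multi_epoch_sensitivity(C, steps_per_epoch)
    noisy_prefix_sums = matrix_mechanism(x, C, C, sens, 1.0, rng)
    print(noisy_prefix_sums.shape)
```

Setting `steps_per_epoch` equal to the number of steps recovers the single-participation case, where the sensitivity reduces to the largest column norm of C.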

Paper Title