Paper title
Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
Paper authors
Paper abstract
We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) with multiple passes (epochs) over a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. We formalize the problem of DP mechanisms for adaptive streams with multiple participations and introduce a non-trivial extension of online matrix factorization DP mechanisms to our setting. This includes establishing the necessary theory for sensitivity calculations and efficient computation of optimal matrices. For some applications, such as those with $>\!\! 10,000$ SGD steps, applying these optimal techniques becomes computationally expensive. We thus design an efficient Fourier-transform-based mechanism with only a minor utility loss. Extensive empirical evaluation on both example-level DP for image classification and user-level DP for language modeling demonstrates substantial improvements over all previous methods, including the widely used DP-SGD. Though our primary application is to ML, our main DP results are applicable to arbitrary linear queries and hence may have much broader applicability.
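To make the matrix factorization idea concrete, here is a minimal NumPy sketch, not the paper's implementation. It factors a prefix-sum workload A as A = BC and releases B(Cx + z) with Gaussian noise z scaled to the sensitivity of C. All function names are hypothetical. The square-root Toeplitz factor is a simple closed-form stand-in for the optimal matrices and the Fourier-transform-based mechanism mentioned in the abstract. The multi-participation sensitivity routine assumes a fixed participation pattern (at most k participations spaced exactly b steps apart), per-step contributions with L2 norm at most 1, and a matrix C whose Gram matrix C^T C has only nonnegative entries.

```python
import numpy as np


def prefix_sum_workload(n):
    """Lower-triangular all-ones matrix A: row t of A @ x is x[0] + ... + x[t]."""
    return np.tril(np.ones((n, n)))


def sqrt_toeplitz_factor(n):
    """Lower-triangular Toeplitz C with C @ C equal to the prefix-sum workload.

    The entries are the power-series coefficients of (1 - x)^(-1/2).
    Illustrative choice only; it is not the optimized factorization.
    """
    coeffs = np.ones(n)
    for k in range(1, n):
        coeffs[k] = coeffs[k - 1] * (2 * k - 1) / (2 * k)
    C = np.zeros((n, n))
    for i in range(n):
        # C[i, j] = coeffs[i - j] for j <= i
        C[i, : i + 1] = coeffs[: i + 1][::-1]
    return C


def multi_participation_sensitivity(C, k, b):
    """L2 sensitivity of x -> C @ x under the assumed participation pattern.

    Assumes a user contributes at steps start, start + b, ... (at most k
    times) and that C.T @ C is entrywise nonnegative, so the worst case is
    reached when all of the user's contributions point the same way.
    """
    n = C.shape[1]
    worst = 0.0
    for start in range(b):
        idx = np.arange(start, n, b)[:k]
        worst = max(worst, np.linalg.norm(C[:, idx].sum(axis=1)))
    return worst


def matrix_mechanism(B, C, x, sensitivity, noise_multiplier, rng):
    """Release B @ (C @ x + z), where z is Gaussian noise scaled to the sensitivity."""
    encoded = C @ x
    z = rng.normal(scale=noise_multiplier * sensitivity, size=encoded.shape)
    return B @ (encoded + z)


def expected_squared_error(B, sensitivity, noise_multiplier, dim):
    """Expected total squared error of the release: E||B z||_F^2."""
    return (np.linalg.norm(B, "fro") * sensitivity * noise_multiplier) ** 2 * dim
```

As a usage sketch, with n = 64 steps and k = 4 participations spaced b = 16 steps apart, one can compare the trivial factorization B = A, C = I (independent noise added to every step, as in DP-SGD) against B = C = `sqrt_toeplitz_factor(n)` by calling `expected_squared_error` with the sensitivity returned by `multi_participation_sensitivity` for each choice of C. At equal noise multiplier, the factorization with the smaller value gives lower expected error on the prefix sums.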