Paper Title

DP$^2$-VAE: Differentially Private Pre-trained Variational Autoencoders

Authors

Dihong Jiang, Guojun Zhang, Mahdi Karami, Xi Chen, Yunfeng Shao, Yaoliang Yu

Abstract

Modern machine learning systems achieve great success when trained on large datasets. However, these datasets usually contain sensitive information (e.g., medical records, face images), leading to serious privacy concerns. Differentially private generative models (DPGMs) emerge as a solution to circumvent such privacy concerns by generating privatized sensitive data. As with other differentially private (DP) learners, the major challenge for DPGMs is to strike a subtle balance between utility and privacy. We propose DP$^2$-VAE, a novel training mechanism for variational autoencoders (VAEs) with provable DP guarantees and improved utility via \emph{pre-training on private data}. Under the same DP constraints, DP$^2$-VAE minimizes the perturbation noise during training and hence improves utility. DP$^2$-VAE is very flexible and easily amenable to many other VAE variants. Theoretically, we study the effect of pre-training on private data. Empirically, we conduct extensive experiments on image datasets to illustrate our superiority over baselines under various privacy budgets and evaluation metrics.
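For context, below is a minimal sketch of the generic DP-SGD-style gradient perturbation that DP generative models build on: each example's gradient is clipped to a fixed norm and Gaussian noise is added before the parameter update. All names and hyperparameters here (TinyVAE, dp_sgd_step, clip_norm, noise_multiplier, the layer sizes) are illustrative placeholders rather than code or values from the paper, and the sketch does not include the pre-training step that distinguishes DP$^2$-VAE.

# Illustrative sketch only: a tiny VAE trained with per-example gradient
# clipping and Gaussian noise (DP-SGD style), not the authors' DP^2-VAE code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(x_dim, 2 * z_dim)   # outputs mean and log-variance
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        recon = self.dec(z)
        # Negative ELBO per example: reconstruction term + KL to the standard normal prior
        rec = F.binary_cross_entropy_with_logits(recon, x, reduction="none").sum(-1)
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(-1)
        return rec + kl   # shape (batch,)

def dp_sgd_step(model, opt, batch, clip_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD step: clip each example's gradient, sum, add Gaussian noise, average."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    losses = model(batch)
    for loss in losses:  # explicit per-example gradients (slow, but shows the mechanism)
        grads = torch.autograd.grad(loss, params, retain_graph=True)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-6), max=1.0)  # clip to clip_norm
        for s, g in zip(summed, grads):
            s += g * scale
    opt.zero_grad()
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm  # calibrated Gaussian noise
        p.grad = (s + noise) / batch.shape[0]
    opt.step()

model = TinyVAE()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.rand(32, 784)   # stand-in for a batch of flattened images in [0, 1]
dp_sgd_step(model, opt, x)

In this scheme, a larger noise_multiplier yields a stronger privacy guarantee at the cost of noisier updates; reducing the noise required for a given privacy budget, e.g. via the pre-training described in the abstract, is what drives the utility gains the paper reports.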
