Paper Title
Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness
Paper Authors
Paper Abstract
It has been demonstrated that the hidden representations learned by a deep model can encode private information about the input and hence be exploited to recover such information with reasonable accuracy. To address this issue, we propose a novel approach called Differentially Private Neural Representation (DPNR) to preserve the privacy of representations extracted from text. DPNR utilises Differential Privacy (DP) to provide a formal privacy guarantee. Further, we show that masking words via dropout can further enhance privacy. To maintain the utility of the learned representation, we integrate the DP-noisy representation into a robust training process to derive a robust target model, which also improves model fairness across various demographic variables. Experimental results on benchmark datasets under various parameter settings demonstrate that DPNR largely reduces privacy leakage without significantly sacrificing main-task performance.
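The core ideas in the abstract (dropout-style word masking plus DP noise on the extracted representation) can be illustrated with a minimal sketch. This is not the paper's exact mechanism: the function names, the use of the Gaussian mechanism with the standard analytic noise scale, the L2 clipping bound, and the placeholder `mask_id` are all illustrative assumptions.

```python
import numpy as np

def mask_words(token_ids, drop_prob=0.1, mask_id=0, rng=None):
    """Randomly drop (mask) input words, a dropout-style step the paper
    argues further enhances privacy. `mask_id` is a hypothetical
    placeholder-token id, not from the paper."""
    rng = rng or np.random.default_rng(0)
    keep = rng.random(len(token_ids)) >= drop_prob
    return [t if k else mask_id for t, k in zip(token_ids, keep)]

def dp_noisy_representation(rep, epsilon=1.0, delta=1e-5, clip_norm=1.0, rng=None):
    """Clip the extracted representation to bound its L2 sensitivity,
    then add Gaussian noise. The scale sigma = sqrt(2 ln(1.25/delta)) * C / eps
    is the classic analytic Gaussian-mechanism bound; the paper's actual
    noise mechanism and calibration may differ."""
    rng = rng or np.random.default_rng(0)
    rep = np.asarray(rep, dtype=float)
    norm = np.linalg.norm(rep)
    if norm > clip_norm:
        rep = rep * (clip_norm / norm)  # enforce ||rep||_2 <= clip_norm
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * clip_norm / epsilon
    return rep + rng.normal(0.0, sigma, size=rep.shape)
```

The noisy representation would then be fed to the downstream (robust) training procedure in place of the raw one, so an adversary observing the representation gains only DP-bounded information about the input.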