Paper Title
Deep Directed Information-Based Learning for Privacy-Preserving Smart Meter Data Release
Paper Authors
Paper Abstract
The explosion of data collection has raised serious privacy concerns among users, since shared data may also reveal sensitive information. The main goal of a privacy-preserving mechanism is to prevent a malicious third party from inferring sensitive information while keeping the shared data useful. In this paper, we study this problem in the context of time series data, and in particular power consumption measurements from smart meters (SMs). Although the Mutual Information (MI) between private and released variables has been used as a common information-theoretic privacy measure, it fails to capture the causal time dependencies present in power consumption time series data. To overcome this limitation, we introduce Directed Information (DI) as a more meaningful measure of privacy in the considered setting and propose a novel loss function. The optimization is then performed using an adversarial framework in which two Recurrent Neural Networks (RNNs), referred to as the releaser and the adversary, are trained with opposite goals. Our empirical studies on real-world SM measurement data sets, under the worst-case assumption that the attacker has access to the entire training data set used by the releaser, validate the proposed method and show the trade-off between privacy and utility.
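For reference, the directed information from a sequence X^T to a sequence Y^T is commonly defined (following Massey) as

I(X^T \to Y^T) = \sum_{t=1}^{T} I(X^t; Y_t \mid Y^{t-1}),

where X^t = (X_1, \ldots, X_t). Unlike mutual information, each term conditions only on the past of the released sequence, which is why DI can capture the causal temporal structure of consumption data; the exact privacy measure and loss function used in the paper are specified in the full text.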
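Below is a minimal sketch of the kind of releaser/adversary adversarial training the abstract describes, assuming a PyTorch implementation. The network sizes, the noise injection, the trade-off weight lmbda, and the per-time-step cross-entropy proxy for the privacy term are illustrative assumptions, not the authors' exact DI-based loss or architecture.

    # Hypothetical sketch: adversarial training of a releaser RNN and an adversary RNN.
    import torch
    import torch.nn as nn

    class SeqNet(nn.Module):
        """Simple recurrent network: GRU followed by a per-time-step linear head."""
        def __init__(self, in_dim, hid_dim, out_dim):
            super().__init__()
            self.rnn = nn.GRU(in_dim, hid_dim, batch_first=True)
            self.head = nn.Linear(hid_dim, out_dim)

        def forward(self, x):
            h, _ = self.rnn(x)       # (batch, T, hid_dim)
            return self.head(h)      # (batch, T, out_dim)

    # Releaser maps the true consumption (plus noise, for randomization) to a released
    # series; the adversary tries to infer a private binary state from the release.
    releaser = SeqNet(in_dim=2, hid_dim=64, out_dim=1)
    adversary = SeqNet(in_dim=1, hid_dim=64, out_dim=1)

    opt_r = torch.optim.Adam(releaser.parameters(), lr=1e-3)
    opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()
    lmbda = 1.0  # privacy-utility trade-off weight (assumed value)

    def train_step(consumption, private):
        # consumption: (B, T, 1) float; private: (B, T, 1) float in {0, 1}
        noise = torch.randn_like(consumption)
        released = releaser(torch.cat([consumption, noise], dim=-1))

        # Adversary step: minimize its error at inferring the private sequence.
        adv_loss = bce(adversary(released.detach()), private)
        opt_a.zero_grad()
        adv_loss.backward()
        opt_a.step()

        # Releaser step: keep the release close to the true signal (utility)
        # while maximizing the adversary's loss (privacy).
        distortion = nn.functional.mse_loss(released, consumption)
        privacy_term = bce(adversary(released), private)
        rel_loss = distortion - lmbda * privacy_term
        opt_r.zero_grad()
        rel_loss.backward()
        opt_r.step()
        return distortion.item(), adv_loss.item()

Training alternates between the two updates, and sweeping lmbda traces out the privacy-utility trade-off referred to in the abstract.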