Paper Title
Differentially Private Label Protection in Split Learning
Paper Authors
Paper Abstract
Split learning is a distributed training framework that allows multiple parties to jointly train a machine learning model over vertically partitioned data (data partitioned by attributes). The idea is that only intermediate computation results, rather than private features and labels, are shared between parties, so that the raw training data remains private. Nevertheless, recent works have shown that plaintext implementations of split learning suffer from severe privacy risks: a semi-honest adversary can easily reconstruct labels. In this work, we propose \textsf{TPSL} (Transcript Private Split Learning), a generic gradient-perturbation-based split learning framework that provides a provable differential privacy guarantee. Differential privacy is enforced not only on the model weights but also on the messages communicated in the distributed computation setting. Our experiments on large-scale real-world datasets demonstrate the robustness and effectiveness of \textsf{TPSL} against label leakage attacks. We also find that \textsf{TPSL} has a better utility-privacy trade-off than baselines.
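The abstract does not spell out \textsf{TPSL}'s exact mechanism, but the generic idea of gradient perturbation in split learning can be sketched as follows: the label-holding party clips and noises the gradient of the loss with respect to the cut-layer activations before returning it to the feature-holding party, so that the communicated message itself carries a differential privacy guarantee. The sketch below is a minimal illustration of this generic technique, not the paper's algorithm; the function name perturb_cut_gradient and the parameters clip_norm and noise_multiplier are hypothetical.

import numpy as np

def perturb_cut_gradient(grad, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip per-example cut-layer gradients and add Gaussian noise.

    grad: array of shape (batch, dim), the gradient of the loss with
          respect to the cut-layer activations, computed by the
          label-holding party during backpropagation.
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Clip each example's gradient to bound the sensitivity of the message.
    norms = np.linalg.norm(grad, axis=1, keepdims=True)
    clipped = grad * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Add isotropic Gaussian noise calibrated to the clipping norm
    # (Gaussian mechanism), so the transmitted gradient is perturbed.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise

# The feature-holding party receives only the noisy gradient and uses it
# to update its half of the model, never observing the raw labels.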