Paper Title
The Right to be Forgotten in Federated Learning: An Efficient Realization with Rapid Retraining
Paper Authors
Paper Abstract
In machine learning, the emergence of \textit{the right to be forgotten} gave birth to a paradigm named \textit{machine unlearning}, which enables data holders to proactively erase their data from a trained model. Existing machine unlearning techniques focus on centralized training, where the server must have access to all holders' training data to conduct the unlearning process. How to achieve unlearning when full access to all training data is unavailable remains largely underexplored. One noteworthy example is Federated Learning (FL), where each participating data holder trains locally, without sharing its training data with the central server. In this paper, we investigate the problem of machine unlearning in FL systems. We start with a formal definition of the unlearning problem in FL and propose a rapid retraining approach to completely erase data samples from a trained FL model. The resulting design allows data holders to jointly conduct the unlearning process efficiently while keeping their training data local. Our formal convergence and complexity analyses demonstrate that our design preserves model utility with high efficiency. Extensive evaluations on four real-world datasets illustrate the effectiveness and performance of our proposed realization.
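To make the problem setting concrete, the sketch below simulates the scenario the abstract describes: several data holders jointly fit a model via FedAvg-style local training and server-side averaging, and one holder's data is then erased by retraining without it. This is a minimal NumPy toy (linear least squares, synthetic data) and a *naive* retrain-from-scratch baseline — it illustrates the unlearning problem definition, not the paper's rapid retraining method; all names (`fed_avg`, `unlearn_by_retraining`) and hyperparameters are illustrative assumptions.

```python
import numpy as np

def local_step(w, X, y, lr=0.1):
    """One local gradient step on a least-squares objective."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def fed_avg(datasets, rounds=50, dim=3):
    """Minimal FedAvg: each holder updates locally, server averages.
    Training data (X, y) never leaves the holder's tuple."""
    w = np.zeros(dim)
    for _ in range(rounds):
        local_models = [local_step(w.copy(), X, y) for X, y in datasets]
        w = np.mean(local_models, axis=0)
    return w

def unlearn_by_retraining(datasets, forget_idx, **kw):
    """Naive unlearning baseline: retrain from scratch with the
    forgotten holder's dataset excluded. The paper's contribution is
    making this erasure far cheaper than full retraining."""
    kept = [d for i, d in enumerate(datasets) if i != forget_idx]
    return fed_avg(kept, **kw)

# Synthetic federation: 4 holders, shared ground-truth model.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
datasets = []
for _ in range(4):
    X = rng.normal(size=(64, 3))
    datasets.append((X, X @ true_w + 0.01 * rng.normal(size=64)))

w_full = fed_avg(datasets)                           # model trained on all holders
w_unlearned = unlearn_by_retraining(datasets, forget_idx=0)  # holder 0 erased
```

After unlearning, `w_unlearned` is exactly the model that would have been trained had holder 0 never participated — the "full erasure" guarantee that approximate unlearning methods try to reach at lower cost.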