Paper Title

Semi-Targeted Model Poisoning Attack on Federated Learning via Backward Error Analysis

Paper Authors

Yuwei Sun, Hideya Ochiai, Jun Sakuma

Abstract

Model poisoning attacks on federated learning (FL) intrude on the entire system by compromising an edge model, resulting in malfunctioning machine learning models. Such compromised models are tampered with to perform adversary-desired behaviors. In particular, we consider a semi-targeted situation where the source class is predetermined but the target class is not. The goal is to cause the global classifier to misclassify data of the source class. Although approaches such as label flipping have been adopted to inject poisoned parameters into FL, their performance has been shown to be class-sensitive, varying with the target class applied. Typically, an attack becomes less effective when shifted to a different target class. To overcome this challenge, we propose the Attacking Distance-aware Attack (ADA), which enhances a poisoning attack by finding the optimized target class in the feature space. Moreover, we study a more challenging situation where the adversary has limited prior knowledge of a client's data. To tackle this problem, ADA deduces pair-wise distances between classes in the latent feature space from the shared model parameters based on backward error analysis. We performed extensive empirical evaluations of ADA by varying the attacking frequency in three different image classification tasks. As a result, ADA succeeded in increasing the attack performance by 1.8 times in the most challenging case, with an attacking frequency of 0.01.
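
To make the idea of "attacking distance" concrete, below is a minimal sketch, assuming access to per-class latent features (e.g., penultimate-layer activations of the shared global model), of how a semi-targeted attacker might pick the target class with the smallest distance to the source class and then apply label flipping. The helper names (`class_centroids`, `select_target_class`, `poison_labels`), the toy data, and the Euclidean centroid distance are illustrative assumptions; the paper itself infers these pair-wise distances from the shared model parameters via backward error analysis rather than from raw client data.

```python
# Illustrative sketch only (not the authors' implementation): choose the
# target class whose feature centroid is closest to the source class, then
# flip source-class labels to that target for a label-flipping poison.
import numpy as np


def class_centroids(features: np.ndarray, labels: np.ndarray) -> dict:
    """Mean latent feature vector per class (assumed available to the attacker)."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}


def select_target_class(centroids: dict, source: int) -> int:
    """Return the class with the smallest attacking distance to `source`."""
    src = centroids[source]
    distances = {
        c: float(np.linalg.norm(vec - src))
        for c, vec in centroids.items()
        if c != source
    }
    return min(distances, key=distances.get)


def poison_labels(labels: np.ndarray, source: int, target: int) -> np.ndarray:
    """Label flipping: relabel all source-class samples as the chosen target class."""
    poisoned = labels.copy()
    poisoned[poisoned == source] = target
    return poisoned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy latent features for 4 classes, standing in for penultimate-layer activations.
    feats = rng.normal(size=(400, 16)) + np.repeat(np.arange(4), 100)[:, None]
    labels = np.repeat(np.arange(4), 100)

    centroids = class_centroids(feats, labels)
    target = select_target_class(centroids, source=0)
    poisoned = poison_labels(labels, source=0, target=target)
    print(f"selected target class: {target}")
```

The design choice this illustrates is that, for a fixed source class, nearby classes in the feature space are easier misclassification targets than distant ones, which is why a distance-aware choice of target class can outperform a fixed or random choice.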
