Paper Title

Playing to Learn Better: Repeated Games for Adversarial Learning with Multiple Classifiers

Paper Authors

Prithviraj Dasgupta, Joseph B. Collins, Michael McCarrick

Abstract

We consider the problem of prediction by a machine learning algorithm, called the learner, within an adversarial learning setting. The learner's task is to correctly predict the class of data passed to it as a query. However, along with queries containing clean data, the learner could also receive malicious or adversarial queries from an adversary. The adversary's objective is to evade the learner's prediction mechanism by sending adversarial queries that cause the learner to predict an erroneous class, while the learner's objective is to reduce incorrect predictions on these adversarial queries without degrading prediction quality on clean queries. We propose a game-theoretic technique called a Repeated Bayesian Sequential Game, in which the learner interacts repeatedly with a model of the adversary via self-play to determine the distribution of adversarial versus clean queries. It then strategically selects a classifier from a set of pre-trained classifiers, balancing the likelihood of correctly predicting the query against the cost of using the classifier. We have evaluated our proposed technique on clean and adversarial text data with deep neural network-based classifiers, and shown that the learner can select a classifier commensurate with the query type (clean or adversarial) while remaining aware of the cost of using the classifier.
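The core loop described in the abstract — repeatedly observing queries from an adversary model, updating an estimate of the adversarial-query rate, and then choosing the pre-trained classifier that best trades off expected accuracy against usage cost — can be illustrated with a minimal sketch. This is not the paper's actual Repeated Bayesian Sequential Game formulation; the belief update here is a simple exponential smoothing stand-in, and the names (`update_belief`, `select_classifier`, `acc_adv`, etc.) and the classifier accuracy/cost numbers are hypothetical.

```python
import random

def update_belief(p_adv, is_adversarial, lr=0.1):
    # Exponentially smoothed estimate of the adversarial-query rate
    # (a stand-in for the paper's Bayesian update; values are hypothetical).
    return (1 - lr) * p_adv + lr * (1.0 if is_adversarial else 0.0)

def select_classifier(p_adv, classifiers):
    # Pick the classifier maximizing expected accuracy minus usage cost,
    # given the current belief about the adversarial-query rate.
    def utility(c):
        expected_acc = p_adv * c["acc_adv"] + (1 - p_adv) * c["acc_clean"]
        return expected_acc - c["cost"]
    return max(classifiers, key=utility)

# Hypothetical pre-trained classifiers: a cheap one that is accurate on
# clean data and a costlier robust one that is accurate on adversarial data.
classifiers = [
    {"name": "cheap",  "acc_clean": 0.95, "acc_adv": 0.40, "cost": 0.05},
    {"name": "robust", "acc_clean": 0.90, "acc_adv": 0.85, "cost": 0.20},
]

# Simulated repeated play against an adversary model that sends mostly
# adversarial queries: the belief drifts upward, so the robust classifier
# ends up being selected despite its higher cost.
random.seed(0)
p_adv = 0.5
for _ in range(50):
    is_adv = random.random() < 0.8  # adversary model's query mix
    p_adv = update_belief(p_adv, is_adv)

choice = select_classifier(p_adv, classifiers)
print(choice["name"], round(p_adv, 2))
```

With these toy numbers the "robust" classifier is preferred whenever the estimated adversarial rate exceeds 0.4, which mirrors the abstract's point that classifier choice should be commensurate with the query type while remaining cost-aware.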
