Paper Title


Rationale-Guided Few-Shot Classification to Detect Abusive Language

Authors

Punyajoy Saha, Divyanshu Sheth, Kushal Kedia, Binny Mathew, Animesh Mukherjee

Abstract


Abusive language is a concerning problem in online social media. Past research on detecting abusive language covers different platforms, languages, demographics, etc. However, models trained on these datasets do not perform well in cross-domain evaluation settings. To overcome this, a common strategy is to use a few samples from the target domain to train models to get better performance in that domain (cross-domain few-shot training). However, this might cause the models to overfit the artefacts of those samples. A compelling solution could be to guide the models toward rationales, i.e., spans of text that justify the text's label. This method has been found to improve model performance in the in-domain setting across various NLP tasks. In this paper, we propose RGFS (Rationale-Guided Few-Shot Classification) for abusive language detection. We first build a multitask learning setup to jointly learn rationales, targets, and labels, and find a significant improvement of 6% macro F1 on the rationale detection task over training rationale classifiers alone. We introduce two rationale-integrated BERT-based architectures (the RGFS models) and evaluate our systems over five different abusive language datasets, finding that in the few-shot classification setting, RGFS-based models outperform baseline models by about 7% in macro F1 scores and perform competitively with models finetuned on other source domains. Furthermore, RGFS-based models outperform LIME/SHAP-based approaches in terms of plausibility and are close in performance in terms of faithfulness.
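The abstract describes predicting token-level rationales and then feeding them back into a BERT-based classifier. As a minimal, library-free sketch of that data flow (the function names, the mask format, and the `[SEP]`-concatenation fusion strategy are illustrative assumptions, not the paper's actual architecture), rationale-guided input construction might look like:

```python
def select_rationale_tokens(tokens, rationale_mask):
    """Keep only the tokens a (hypothetical) rationale head flagged
    as justifying the label (mask value 1 = part of the rationale)."""
    return [tok for tok, keep in zip(tokens, rationale_mask) if keep]

def build_rationale_guided_input(tokens, rationale_mask, sep="[SEP]"):
    """Assumed fusion: append the rationale span after a separator so a
    BERT-style classifier sees both the full text and the rationale."""
    rationale = select_rationale_tokens(tokens, rationale_mask)
    return tokens + [sep] + rationale

# Toy example: the rationale head marks "awful" as the justifying span.
tokens = ["you", "people", "are", "awful"]
mask = [0, 0, 0, 1]
print(build_rationale_guided_input(tokens, mask))
# → ['you', 'people', 'are', 'awful', '[SEP]', 'awful']
```

In the actual multitask setup, the rationale mask would come from a token-classification head trained jointly with the label head, rather than being supplied by hand as here.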
