Paper Title
Leveraging Large-scale Multimedia Datasets to Refine Content Moderation Models
Paper Authors
Paper Abstract
The sheer volume of online user-generated content has rendered content moderation technologies essential for protecting digital platform audiences from content that may cause anxiety, worry, or concern. Despite efforts to develop automated solutions to this problem, creating accurate models remains challenging due to the lack of adequate task-specific training data. The latter limitation is directly related to the fact that manually annotating such data is a highly demanding procedure that can severely affect the annotators' emotional well-being. In this paper, we propose the CM-Refinery framework, which leverages large-scale multimedia datasets to automatically extend initial training datasets with hard examples that refine content moderation models, while significantly reducing the involvement of human annotators. We apply our method to two model adaptation strategies designed with respect to the different challenges observed while collecting data, i.e., the lack of (i) task-specific negative data or (ii) both positive and negative data. Additionally, we introduce a diversity criterion applied to the data collection process that further enhances the generalization performance of the refined models. The proposed method is evaluated on the Not Safe for Work (NSFW) and disturbing content detection tasks on benchmark datasets, achieving accuracy improvements of 1.32% and 1.94%, respectively, compared to the state of the art. Finally, it significantly reduces human involvement, as 92.54% of the data are automatically annotated in the case of disturbing content, while no human intervention is required for the NSFW task.
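The abstract's core idea of mining hard examples from a large unlabeled pool while enforcing diversity can be sketched as follows. This is a minimal illustration, not the paper's actual procedure: the near-boundary score window, Euclidean embedding distance, and greedy farthest-point selection are all assumptions introduced here for clarity.

```python
import numpy as np

def select_hard_diverse(scores, embeddings, k, low=0.4, high=0.6):
    """Pick up to k 'hard' unlabeled samples (classifier scores near the
    decision boundary) and greedily enforce diversity by choosing, at each
    step, the candidate farthest from all already-selected samples in
    embedding space.

    Illustrative sketch only: thresholds, distance metric, and the greedy
    scheme are assumptions, not the CM-Refinery specifics.
    """
    # Hard examples: scores inside an assumed uncertainty window.
    hard = np.where((scores >= low) & (scores <= high))[0]
    if len(hard) == 0:
        return []
    # Seed with the hardest example (score closest to 0.5).
    selected = [int(hard[np.argmin(np.abs(scores[hard] - 0.5))])]
    candidates = set(hard.tolist()) - set(selected)
    while candidates and len(selected) < k:
        cand = np.array(sorted(candidates))
        # Distance of each candidate to its nearest selected sample.
        d = np.linalg.norm(
            embeddings[cand][:, None, :] - embeddings[selected][None, :, :],
            axis=-1,
        ).min(axis=1)
        best = int(cand[np.argmax(d)])  # most novel remaining candidate
        selected.append(best)
        candidates.discard(best)
    return selected
```

The diversity step mirrors the abstract's diversity criterion in spirit: without it, a greedy hardness-only selection tends to pick many near-duplicate samples, which limits the generalization gain of retraining.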