Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2020

Paper Title

Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2020

Authors

Maaz Amjad, Grigori Sidorov, Alisa Zhila, Alexander Gelbukh, Paolo Rosso

Abstract

This overview paper describes the first shared task on fake news detection in the Urdu language. The task was posed as a binary classification problem in which the goal is to differentiate between real and fake news. We provided a dataset of 900 annotated news articles for training and 400 news articles for testing. The dataset contained news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business. Forty-two teams from six different countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task. Nine teams submitted their experimental results. The participants used various machine learning methods, ranging from feature-based traditional machine learning to neural network techniques. The best-performing system achieved an F-score of 0.90, showing that the BERT-based approach outperforms other machine learning techniques.
