Paper Title

A Vietnamese Dataset for Evaluating Machine Reading Comprehension

Authors

Kiet Van Nguyen, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Abstract

Over 97 million people worldwide speak Vietnamese as their native language. However, there are few research studies on machine reading comprehension (MRC) for Vietnamese, the task of understanding a text and answering questions related to it. Due to the lack of benchmark datasets for Vietnamese, we present the Vietnamese Question Answering Dataset (UIT-ViQuAD), a new dataset for evaluating MRC models on Vietnamese, a low-resource language. This dataset comprises over 23,000 human-generated question-answer pairs based on 5,109 passages from 174 Vietnamese Wikipedia articles. In particular, we propose a new dataset creation process for Vietnamese MRC. Our in-depth analyses show that our dataset requires abilities beyond simple reasoning such as word matching and demands both single-sentence and multiple-sentence inference. In addition, we conduct experiments with state-of-the-art MRC methods for English and Chinese as the first experimental models on UIT-ViQuAD. We also estimate human performance on the dataset and compare it with the experimental results of powerful machine learning models. The substantial gap between human performance and the best model performance on the dataset indicates that improvements can be made on UIT-ViQuAD in future research. Our dataset is freely available on our website to encourage the research community to overcome challenges in Vietnamese MRC.
