Paper Title
KILT: a Benchmark for Knowledge Intensive Language Tasks
Paper Authors
Paper Abstract
Challenging problems such as open-domain question answering, fact checking, slot filling and entity linking require access to large, external knowledge sources. While some models do well on individual tasks, developing general models is difficult as each task might require computationally expensive indexing of custom knowledge sources, in addition to dedicated infrastructure. To catalyze research on models that condition on specific information in large textual resources, we present a benchmark for knowledge-intensive language tasks (KILT). All tasks in KILT are grounded in the same snapshot of Wikipedia, reducing engineering turnaround through the re-use of components, as well as accelerating research into task-agnostic memory architectures. We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering and dialogue, and yielding competitive results on entity linking and slot filling, by generating disambiguated text. KILT data and code are available at https://github.com/facebookresearch/KILT.
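The abstract's central architectural idea is that a single dense vector index over one Wikipedia snapshot can serve retrieval for many tasks, with a seq2seq model conditioning on the retrieved passages. The following is a minimal illustrative sketch of that retrieve-then-generate pattern, not the paper's implementation: the toy `embed` function (hashed bag-of-words) stands in for a learned dense encoder such as the one used in KILT's baselines, and the passages are invented examples.

```python
# Hedged sketch of a shared dense index serving knowledge-intensive tasks.
# Assumption: `embed` is a toy hashed bag-of-words encoder, standing in
# for a learned dense passage encoder; it is NOT the method in the paper.
import math
import zlib
from collections import Counter


def embed(text, dim=64):
    """Toy embedding: hash each token into a fixed-size, L2-normalized vector."""
    vec = [0.0] * dim
    for tok, n in Counter(text.lower().split()).items():
        tok = tok.strip(".,?!")  # crude normalization so punctuation doesn't split tokens
        vec[zlib.crc32(tok.encode()) % dim] += n
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class DenseIndex:
    """One index built over a single knowledge source, shared across tasks."""

    def __init__(self, passages):
        self.passages = passages
        self.vectors = [embed(p) for p in passages]

    def retrieve(self, query, k=2):
        """Return the top-k passages by dot product; these double as provenance."""
        qv = embed(query)
        ranked = sorted(
            range(len(self.passages)),
            key=lambda i: -sum(a * b for a, b in zip(qv, self.vectors[i])),
        )
        return [self.passages[i] for i in ranked[:k]]


# Invented example passages (a stand-in for a Wikipedia snapshot).
passages = [
    "Edinburgh is the capital city of Scotland.",
    "The kilt is a garment associated with Scottish culture.",
    "Wikipedia is a free online encyclopedia.",
]
index = DenseIndex(passages)
top = index.retrieve("What is the capital of Scotland?", k=1)
# A seq2seq model would then condition on the query plus `top` to generate
# the answer, citing the retrieved passage as provenance.
```

Because every task queries the same index, adding a new task requires no re-indexing, which is the engineering benefit the abstract describes.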