IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach

Paper title

IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach

Authors

Burdisso, Sergio; Zuluaga-Gomez, Juan; Villatoro-Tello, Esau; Fajcik, Martin; Singh, Muskaan; Smrz, Pavel; Motlicek, Petr

Abstract

In this paper, we describe our participation in Subtask 1 of CASE-2022, Event Causality Identification with the Causal News Corpus. We address the Causal Relation Identification (CRI) task by exploiting a set of simple yet complementary techniques for fine-tuning language models (LMs) on a small number of annotated examples (i.e., a few-shot configuration). We follow a prompt-based prediction approach for fine-tuning LMs in which the CRI task is treated as a masked language modeling (MLM) problem. This approach allows LMs natively pre-trained on MLM problems to directly generate textual responses to CRI-specific prompts. We compare the performance of this method against ensemble techniques trained on the entire dataset. Our best-performing submission was fine-tuned with only 256 instances per class, 15.7% of all available data, and yet obtained the second-best precision (0.82), third-best accuracy (0.82), and an F1-score (0.85) very close to that reported by the winning team (0.86).
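The prompt-based idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cloze template, the label words "Yes"/"No", and all function names below are assumptions made for illustration, and `toy_mask_scores` is a keyword-based stand-in for a real masked LM. The sketch shows three pieces: wrapping a sentence in a cloze prompt, mapping the word predicted at the mask position back to a class, and drawing a fixed number of examples per class for a few-shot training set.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical cloze template and verbalizer; the paper's actual prompt
# and label words are not given in this abstract.
TEMPLATE = "{sentence} Is there a causal relation? {mask}"
VERBALIZER = {"Yes": 1, "No": 0}  # label word -> class id (1 = causal)


def build_prompt(sentence: str, mask_token: str = "[MASK]") -> str:
    """Wrap a sentence in the cloze template so an MLM can fill in a label word."""
    return TEMPLATE.format(sentence=sentence, mask=mask_token)


def predict(sentence: str, mask_scores: Callable[[str], Dict[str, float]]) -> int:
    """Return the class whose label word scores highest at the mask position."""
    scores = mask_scores(build_prompt(sentence))
    best_word = max(VERBALIZER, key=lambda w: scores.get(w, float("-inf")))
    return VERBALIZER[best_word]


def sample_few_shot(
    data: List[Tuple[str, int]], k: int, seed: int = 0
) -> List[Tuple[str, int]]:
    """Draw at most k examples per class to form a few-shot training set."""
    by_label = defaultdict(list)
    for text, label in data:
        by_label[label].append((text, label))
    rng = random.Random(seed)
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        subset.extend(rng.sample(items, min(k, len(items))))
    return subset


def toy_mask_scores(prompt: str) -> Dict[str, float]:
    """Stand-in for a real masked LM: favours "Yes" when a causal cue appears."""
    causal = "because" in prompt or "due to" in prompt
    return {"Yes": 0.9, "No": 0.1} if causal else {"Yes": 0.2, "No": 0.8}


if __name__ == "__main__":
    print(build_prompt("The strike ended because wages were raised."))
    print(predict("The strike ended because wages were raised.", toy_mask_scores))
```

In a real setup, `mask_scores` would be backed by a pre-trained masked LM fine-tuned on the sampled subset, with `k = 256` matching the per-class budget mentioned in the abstract.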
