论文标题
Idiapers @ Causal News Corpus 2022:通过及时基于迅速的几种方法的有效因果关系识别
IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach
论文作者
论文摘要
在本文中,我们描述了我们参与Case-2022的子任务1,即与休闲新闻语料库的事件因果关系识别。我们通过在少数带注释的示例(即几次摄影配置)上利用一组简单但互补的技术来解决因果关系识别(CRI)任务。我们遵循一种基于及时的预测方法,用于微调LMS,其中CRI任务被视为掩盖语言建模问题(MLM)。这种方法允许LMS对MLM问题进行本地预先训练,可以直接生成对CRI特异性提示的文本响应。我们将此方法的性能与在整个数据集中训练的集合技术进行比较。我们表现最佳的提交仅进行了微调,每个班级只有256个实例,占所有可用数据的15.7%,但获得了第二好的精度(0.82),准确性(0.82)和F1得分(0.85)(0.85)非常接近赢家团队(0.86)报告的内容。
In this paper, we describe our participation in the subtask 1 of CASE-2022, Event Causality Identification with Casual News Corpus. We address the Causal Relation Identification (CRI) task by exploiting a set of simple yet complementary techniques for fine-tuning language models (LMs) on a small number of annotated examples (i.e., a few-shot configuration). We follow a prompt-based prediction approach for fine-tuning LMs in which the CRI task is treated as a masked language modeling problem (MLM). This approach allows LMs natively pre-trained on MLM problems to directly generate textual responses to CRI-specific prompts. We compare the performance of this method against ensemble techniques trained on the entire dataset. Our best-performing submission was fine-tuned with only 256 instances per class, 15.7% of the all available data, and yet obtained the second-best precision (0.82), third-best accuracy (0.82), and an F1-score (0.85) very close to what was reported by the winner team (0.86).