Paper Title
Inverse Reinforcement Learning for Text Summarization
Paper Authors
Paper Abstract
We introduce inverse reinforcement learning (IRL) as an effective paradigm for training abstractive summarization models, imitating human summarization behaviors. Our IRL model estimates the reward function using a suite of important sub-rewards for summarization and concurrently optimizes the policy network. Experimental results across datasets in different domains (CNN/DailyMail and WikiHow) and various model sizes (BART-base and BART-large) demonstrate the superiority of our proposed IRL model for summarization over MLE and RL baselines. The resulting summaries exhibit greater similarity to human-crafted gold references, outperforming MLE and RL baselines on metrics such as ROUGE, coverage, novelty, compression ratio, factuality, and human evaluations.
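The abstract describes the training scheme only at a high level: a reward function is estimated from a set of sub-rewards while the policy network is optimized concurrently. As a rough illustration of what such an alternating update could look like, below is a minimal MaxEnt-IRL-style sketch with a linear reward over toy sub-reward features (coverage, novelty, compression). The feature implementations, the single-sample gradient estimate, and the irl_update helper are hypothetical simplifications for illustration, not the paper's actual sub-rewards or code.

```python
import torch

def bigrams(tokens):
    return set(zip(tokens, tokens[1:]))

def features(summary, source):
    """Toy sub-reward vector phi(y): coverage, novelty, compression.
    Hypothetical stand-ins for the paper's suite of sub-rewards."""
    s_uni, d_uni = set(summary), set(source)
    coverage = len(s_uni & d_uni) / max(len(s_uni), 1)        # tokens grounded in source
    novelty = len(bigrams(summary) - bigrams(source)) / max(len(bigrams(summary)), 1)
    compression = 1.0 - min(len(summary) / max(len(source), 1), 1.0)
    return torch.tensor([coverage, novelty, compression])

# Linear reward R(y) = w . phi(y) with learnable weights w.
w = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([w], lr=1e-2)

def irl_update(model_summary, gold_summary, source):
    """One MaxEnt-IRL-style reward update: raise the reward of the human
    summary relative to a sample from the current policy. The policy-side
    expectation of phi is approximated here with a single sample."""
    gap = w @ (features(gold_summary, source) - features(model_summary, source))
    loss = -gap                      # maximize the expert-vs-policy reward gap
    opt.zero_grad()
    loss.backward()
    opt.step()
    # The summarizer (e.g., BART) would then take a policy-gradient step
    # using the scalar reward w.detach() @ features(model_summary, source).
    return loss.item()

# Usage on whitespace-tokenized toy strings:
src = "the quick brown fox jumps over the lazy dog near the river bank".split()
gold = "fox jumps over dog near river".split()
sample = "the the quick brown fox fox".split()
print(irl_update(sample, gold, src))
```

In this sketch the reward and policy objectives are decoupled, so the same loop can alternate between reward-weight updates and policy-gradient steps, which is the general shape of IRL training the abstract points to.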