Paper Title

(Psycho-)Linguistic Features Meet Transformer Models for Improved Explainable and Controllable Text Simplification

Authors

Yu Qiao, Xiaofei Li, Daniel Wiechmann, Elma Kerz

Abstract

State-of-the-art text simplification (TS) systems adopt end-to-end neural network models to directly generate the simplified version of the input text, and usually function as a blackbox. Moreover, TS is usually treated as an all-purpose generic task under the assumption of homogeneity, where the same simplification is suitable for all. In recent years, however, there has been increasing recognition of the need to adapt the simplification techniques to the specific needs of different target groups. In this work, we aim to advance current research on explainable and controllable TS in two ways: First, building on recently proposed work to increase the transparency of TS systems, we use a large set of (psycho-)linguistic features in combination with pre-trained language models to improve explainable complexity prediction. Second, based on the results of this preliminary task, we extend a state-of-the-art Seq2Seq TS model, ACCESS, to enable explicit control of ten attributes. The results of experiments show (1) that our approach improves the performance of state-of-the-art models for predicting explainable complexity and (2) that explicitly conditioning the Seq2Seq model on ten attributes leads to a significant improvement in performance in both within-domain and out-of-domain settings.
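The explicit conditioning described above is typically implemented, as in ACCESS, by prepending control tokens to the source sentence that encode the target value of each attribute; the decoder then learns to produce output consistent with those targets. A minimal sketch of this preprocessing step (the token format and attribute names here are illustrative, not the paper's exact ten attributes):

```python
def add_control_tokens(source: str, attributes: dict) -> str:
    """Prepend ACCESS-style control tokens (e.g. <NbChars_0.80>) that
    encode the desired target ratio of each attribute relative to the
    source sentence. Attribute names and format are illustrative."""
    prefix = " ".join(f"<{name}_{ratio:.2f}>" for name, ratio in attributes.items())
    return f"{prefix} {source}"

# Example: request roughly 80% of the original length and a 75%
# Levenshtein similarity to the source (hypothetical target values).
controlled = add_control_tokens(
    "The committee postponed the decision indefinitely.",
    {"NbChars": 0.8, "LevSim": 0.75},
)
print(controlled)
# <NbChars_0.80> <LevSim_0.75> The committee postponed the decision indefinitely.
```

At training time the tokens are computed from the gold source-simplification pairs; at inference time they are set by the user, which is what makes the simplification controllable per target group.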
