Title

Diving Deep into Context-Aware Neural Machine Translation

Authors

Jingjing Huo, Christian Herold, Yingbo Gao, Leonard Dahlmann, Shahram Khadivi, Hermann Ney

Abstract

Context-aware neural machine translation (NMT) is a promising direction for improving translation quality by making use of additional context, e.g., document-level translation or meta-information. Although various architectures and analyses exist, the effectiveness of different context-aware NMT models is not yet well explored. This paper analyzes the performance of document-level NMT models on four diverse domains with varying amounts of parallel document-level bilingual data. We conduct a comprehensive set of experiments to investigate the impact of document-level NMT. We find that there is no single best approach to document-level NMT; rather, different architectures come out on top on different tasks. Looking at task-specific problems, such as pronoun resolution or headline translation, we find improvements in the context-aware systems even in cases where corpus-level metrics like BLEU show no significant improvement. We also show that document-level back-translation significantly helps to compensate for the lack of document-level bi-texts.
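To make the document-level setup concrete, below is a minimal sketch of one common way to feed document context to an NMT model: concatenating the preceding source sentence(s) with the current one, separated by a special break token. This is purely illustrative and not the paper's specific architecture or preprocessing; the `<BRK>` token, the helper name `build_context_inputs`, and the one-sentence context window are assumptions for the example.

```python
from typing import List

BREAK_TOKEN = "<BRK>"  # hypothetical sentence-break token, an assumption for illustration


def build_context_inputs(doc_sentences: List[str], context_size: int = 1) -> List[str]:
    """For each sentence in a document, prepend up to `context_size`
    preceding sentences, joined by the break token, to form the model input."""
    inputs = []
    for i, sent in enumerate(doc_sentences):
        context = doc_sentences[max(0, i - context_size):i]
        inputs.append(f" {BREAK_TOKEN} ".join(context + [sent]))
    return inputs


if __name__ == "__main__":
    doc = ["The cat sat on the mat.", "It looked bored."]
    for line in build_context_inputs(doc):
        print(line)
    # Output:
    # The cat sat on the mat.
    # The cat sat on the mat. <BRK> It looked bored.
```

In such concatenation-based setups, the extra context lets the model resolve cross-sentence phenomena like the pronoun "It" above, which is exactly the kind of task-specific improvement (e.g., pronoun resolution) the paper evaluates even when corpus-level BLEU barely moves.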
