Title

Diving Deep into Context-Aware Neural Machine Translation

Authors

Jingjing Huo, Christian Herold, Yingbo Gao, Leonard Dahlmann, Shahram Khadivi, Hermann Ney

Abstract

Context-aware neural machine translation (NMT) is a promising direction for improving translation quality by making use of additional context, e.g., document-level translation or meta-information. Although various architectures and analyses exist, the effectiveness of different context-aware NMT models is not yet well explored. This paper analyzes the performance of document-level NMT models on four diverse domains with varying amounts of parallel document-level bilingual data. We conduct a comprehensive set of experiments to investigate the impact of document-level NMT. We find that there is no single best approach to document-level NMT; rather, different architectures come out on top on different tasks. Looking at task-specific problems, such as pronoun resolution or headline translation, we find improvements in the context-aware systems even in cases where corpus-level metrics like BLEU show no significant improvement. We also show that document-level back-translation significantly helps to compensate for the lack of document-level bi-texts.
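To make the document-level setup concrete, below is a minimal sketch of one common way to feed document context to an NMT model: concatenating the preceding source sentence(s) with the current one, separated by a special break token. This is purely illustrative and not the paper's specific architecture or preprocessing; the `<BRK>` token, the helper name `build_context_inputs`, and the one-sentence context window are assumptions for the example.

```python
from typing import List

BREAK_TOKEN = "<BRK>"  # hypothetical sentence-break token, an assumption for illustration


def build_context_inputs(doc_sentences: List[str], context_size: int = 1) -> List[str]:
    """For each sentence in a document, prepend up to `context_size`
    preceding sentences, joined by the break token, to form the model input."""
    inputs = []
    for i, sent in enumerate(doc_sentences):
        context = doc_sentences[max(0, i - context_size):i]
        inputs.append(f" {BREAK_TOKEN} ".join(context + [sent]))
    return inputs


if __name__ == "__main__":
    doc = ["The cat sat on the mat.", "It looked bored."]
    for line in build_context_inputs(doc):
        print(line)
    # Output:
    # The cat sat on the mat.
    # The cat sat on the mat. <BRK> It looked bored.
```

In such concatenation-based setups, the extra context lets the model resolve cross-sentence phenomena like the pronoun "It" above, which is exactly the kind of task-specific improvement (e.g., pronoun resolution) the paper evaluates even when corpus-level BLEU barely moves.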
