Paper Title

Token-level Adaptive Training for Neural Machine Translation

Authors

Shuhao Gu, Jinchao Zhang, Fandong Meng, Yang Feng, Wanying Xie, Jie Zhou, Dong Yu

Abstract

There exists a token-imbalance phenomenon in natural language: different tokens appear with different frequencies, which leads to different learning difficulties for tokens in Neural Machine Translation (NMT). The vanilla NMT model usually adopts a trivial equal-weighted objective for target tokens of different frequencies, and tends to generate more high-frequency tokens and fewer low-frequency tokens than the gold token distribution. However, low-frequency tokens may carry critical semantic information, and neglecting them hurts translation quality. In this paper, we explore target-token-level adaptive objectives based on token frequencies that assign an appropriate weight to each target token during training. The aim is that meaningful but relatively low-frequency words can be assigned larger weights in the objective, encouraging the model to pay more attention to these tokens. Our method yields consistent improvements in translation quality on ZH-EN, EN-RO, and EN-DE translation tasks, especially on sentences containing more low-frequency tokens, where we obtain BLEU increases of 1.68, 1.02, and 0.52 over the baseline, respectively. Further analyses show that our method also improves the lexical diversity of translations.
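The core idea of frequency-based adaptive weighting can be sketched as follows. This is a minimal illustration, not the paper's implementation: the specific weighting function below (inverse log-frequency) and the function names `token_weights` and `weighted_nll` are assumptions for demonstration; the paper explores its own family of frequency-based weighting schemes.

```python
import math
from collections import Counter


def token_weights(corpus_tokens, scale=1.0):
    """Assign each target token a weight that grows as its frequency falls.

    The inverse-log-frequency form here is illustrative only; the actual
    weighting functions used in the paper may differ.
    """
    counts = Counter(corpus_tokens)
    return {tok: 1.0 + scale / math.log(cnt + math.e)
            for tok, cnt in counts.items()}


def weighted_nll(token_log_probs, target_tokens, weights):
    """Token-level weighted negative log-likelihood for one target sentence,
    replacing the vanilla equal-weighted cross-entropy objective."""
    return -sum(weights.get(tok, 1.0) * lp
                for tok, lp in zip(target_tokens, token_log_probs))


# A rare token receives a larger weight than a frequent one, so the model
# is penalized more for assigning it low probability during training.
corpus = ["the"] * 100 + ["serendipity"]
w = token_weights(corpus)
print(w["serendipity"] > w["the"])  # True
```

In a real NMT training loop, the per-token weights would simply scale each position's cross-entropy term before summation, leaving the rest of the training procedure unchanged.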
