Compositional Neural Machine Translation by Removing the Lexicon from Syntax

Paper Title

Compositional Neural Machine Translation by Removing the Lexicon from Syntax

Author

Thrush, Tristan

Abstract

The meaning of a natural language utterance is largely determined from its syntax and words. Additionally, there is evidence that humans process an utterance by separating knowledge about the lexicon from syntax knowledge. Theories from semantics and neuroscience claim that complete word meanings are not encoded in the representation of syntax. In this paper, we propose neural units that can enforce this constraint over an LSTM encoder and decoder. We demonstrate that our model achieves competitive performance across a variety of domains including semantic parsing, syntactic parsing, and English to Mandarin Chinese translation. In these cases, our model outperforms the standard LSTM encoder and decoder architecture on many or all of our metrics. To demonstrate that our model achieves the desired separation between the lexicon and syntax, we analyze its weights and explore its behavior when different neural modules are damaged. When damaged, we find that the model displays the knowledge distortions that aphasics are evidenced to have.
