Paper Title
RNNs can generate bounded hierarchical languages with optimal memory
Paper Authors
Paper Abstract
Recurrent neural networks empirically generate natural language with high syntactic fidelity. However, their success is not well-understood theoretically. We provide theoretical insight into this success, proving in a finite-precision setting that RNNs can efficiently generate bounded hierarchical languages that reflect the scaffolding of natural language syntax. We introduce Dyck-($k$,$m$), the language of well-nested brackets (of $k$ types) and $m$-bounded nesting depth, reflecting the bounded memory needs and long-distance dependencies of natural language syntax. The best known results use $O(k^{\frac{m}{2}})$ memory (hidden units) to generate these languages. We prove that an RNN with $O(m \log k)$ hidden units suffices, an exponential reduction in memory, by an explicit construction. Finally, we show that no algorithm, even with unbounded computation, can suffice with $o(m \log k)$ hidden units.
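To make the Dyck-($k$,$m$) definition concrete, here is a minimal sketch of a stack-based recognizer for the language, assuming a hypothetical token encoding ("(i" and ")i" for bracket type i). This is illustrative only and is not the paper's RNN construction: the point is that the only state required is a stack of at most $m$ entries, each drawn from $k$ types, i.e. on the order of $m \log k$ bits, which is the memory budget the abstract refers to.

```python
# A minimal sketch, not from the paper: a stack-based recognizer for
# Dyck-(k, m), the language of well-nested brackets of k types with
# nesting depth at most m. The "(i" / ")i" token encoding is a
# hypothetical choice for illustration. The only state kept is a stack
# of at most m entries, each one of k types -- on the order of
# m * log2(k) bits of memory.

def is_dyck_k_m(tokens, k, m):
    """Return True iff `tokens` is a string of Dyck-(k, m)."""
    opens = {f"({i}": i for i in range(k)}
    closes = {f"){i}": i for i in range(k)}
    stack = []  # open-bracket types currently awaiting a match
    for tok in tokens:
        if tok in opens:
            if len(stack) == m:       # would exceed the depth bound m
                return False
            stack.append(opens[tok])
        elif tok in closes:
            if not stack or stack[-1] != closes[tok]:
                return False          # mismatched or unopened bracket
            stack.pop()
        else:
            return False              # token outside the alphabet
    return not stack                  # every bracket must be closed


if __name__ == "__main__":
    print(is_dyck_k_m(["(0", "(1", ")1", ")0"], k=2, m=2))  # True
    print(is_dyck_k_m(["(0", "(1", ")1", ")0"], k=2, m=1))  # False: depth 2 > 1
    print(is_dyck_k_m(["(0", ")1"], k=2, m=2))              # False: mismatched types
```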