Paper Title


Measuring Systematic Generalization in Neural Proof Generation with Transformers

Paper Authors

Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal

Paper Abstract


We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when evaluated on longer-than-trained sequences. However, we observe TLMs improve their generalization performance after being exposed to longer, exhaustive proofs. In addition, we discover that TLMs are able to generalize better using backward-chaining proofs compared to their forward-chaining counterparts, while they find it easier to generate forward-chaining proofs. We observe that models that are not trained to generate proofs are better at generalizing to problems based on longer proofs. This suggests that Transformers have efficient internal reasoning strategies that are harder to interpret. These results highlight the systematic generalization behavior of TLMs in the context of logical reasoning, and we believe this work motivates deeper inspection of their underlying reasoning strategies.
