Paper title
Think before you act: A simple baseline for compositional generalization
Paper authors
Paper abstract
In contrast to humans, who can readily recombine familiar expressions to create novel ones, modern neural networks struggle to do so. This was highlighted recently by the introduction of the benchmark dataset "gSCAN" (Ruis et al., 2020), which aims to evaluate models' performance at compositional generalization in grounded language understanding. In this work, we challenge the gSCAN benchmark by proposing a simple model that achieves surprisingly good performance on two of the gSCAN test splits. Our model is based on the observation that, to succeed on gSCAN tasks, the agent must (i) identify the target object (think) before (ii) navigating to it successfully (act). Concretely, we propose an attention-inspired modification of the baseline model of Ruis et al. (2020), together with an auxiliary loss, that takes into account the sequential nature of steps (i) and (ii). While our approach trivially solves two of the compositional tasks, we also find that the other tasks remain unsolved, validating the relevance of gSCAN as a benchmark for evaluating models' compositional abilities.
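To make the "think before you act" idea concrete, below is a minimal PyTorch-style sketch, not the authors' actual code. It assumes a gSCAN-like setup in which the agent receives an instruction and a grid-world observation; the "think" step scores grid cells with an attention-like mechanism and is supervised by an auxiliary target-location loss, and the "act" step decodes the action sequence conditioned on the attended target representation. All module names, shapes, and hyperparameters (ThinkThenAct, target_scorer, aux_weight, etc.) are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ThinkThenAct(nn.Module):
    """Sketch: first predict the target cell ("think"), then decode actions ("act")."""

    def __init__(self, cmd_vocab, n_actions, grid_channels, d=128):
        super().__init__()
        self.cmd_embed = nn.Embedding(cmd_vocab, d)
        self.cmd_encoder = nn.LSTM(d, d, batch_first=True)
        self.grid_proj = nn.Linear(grid_channels, d)
        self.target_scorer = nn.Linear(d, 1)   # "think": one score per grid cell
        self.decoder = nn.LSTMCell(2 * d, d)   # "act": step-by-step action decoding
        self.action_head = nn.Linear(d, n_actions)

    def forward(self, command, grid, max_steps=20):
        # command: (B, L) token ids; grid: (B, H*W, grid_channels) per-cell features.
        cmd, _ = self.cmd_encoder(self.cmd_embed(command))
        cmd_vec = cmd[:, -1]                   # instruction summary, (B, d)
        cells = self.grid_proj(grid)           # (B, H*W, d)

        # "Think": attention-style scores over cells, also used as auxiliary-loss logits.
        scores = self.target_scorer(torch.tanh(cells + cmd_vec.unsqueeze(1))).squeeze(-1)
        attn = F.softmax(scores, dim=-1)
        target_repr = torch.einsum("bn,bnd->bd", attn, cells)

        # "Act": decode actions conditioned on the attended target representation.
        h = torch.zeros_like(cmd_vec)
        c = torch.zeros_like(cmd_vec)
        action_logits = []
        for _ in range(max_steps):
            h, c = self.decoder(torch.cat([cmd_vec, target_repr], dim=-1), (h, c))
            action_logits.append(self.action_head(h))
        return scores, torch.stack(action_logits, dim=1)


def loss_fn(target_logits, action_logits, target_cell, actions, aux_weight=1.0):
    # Auxiliary "think" loss (target-cell classification) plus the usual action loss.
    think_loss = F.cross_entropy(target_logits, target_cell)
    act_loss = F.cross_entropy(action_logits.flatten(0, 1), actions.flatten())
    return act_loss + aux_weight * think_loss

The key design choice this sketch illustrates is the sequential decomposition: the auxiliary loss explicitly supervises target identification before any action is produced, so the decoder only has to navigate to an already-localized cell rather than jointly resolving the referent and the path.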