Paper Title
What Makes Reading Comprehension Questions Difficult?
Paper Authors
Paper Abstract
For a natural language understanding benchmark to be useful in research, it has to consist of examples that are diverse and difficult enough to discriminate among current and near-future state-of-the-art systems. However, we do not yet know how best to select text sources to collect a variety of challenging examples. In this study, we crowdsource multiple-choice reading comprehension questions for passages taken from seven qualitatively distinct sources, analyzing what attributes of passages contribute to the difficulty and question types of the collected examples. To our surprise, we find that passage source, length, and readability measures do not significantly affect question difficulty. Through our manual annotation of seven reasoning types, we observe several trends between passage sources and reasoning types, e.g., logical reasoning is more often required in questions written for technical passages. These results suggest that when creating a new benchmark dataset, selecting a diverse set of passages can help ensure a diverse range of question types, but that passage difficulty need not be a priority.