Paper Title
Suffix Retrieval-Augmented Language Modeling
Paper Authors
Paper Abstract
Causal language modeling (LM) uses word history to predict the next word. BERT, on the other hand, makes use of bi-directional word information in a sentence to predict words at masked positions. While BERT is effective in sequence encoding, it is non-causal by nature and is not designed for sequence generation. In this paper, we propose a novel language model, SUffix REtrieval-Augmented LM (SUREALM), that simulates a bi-directional contextual effect in an autoregressive manner. SUREALM employs an embedding retriever to search for training sentences in a data store that share similar word history during sequence generation. In particular, the suffix portions of the retrieved sentences mimic the "future" context. We evaluated our proposed model on the DSTC9 spoken dialogue corpus and showed promising word perplexity reductions on the validation and test sets compared to competitive baselines.
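
To make the retrieval mechanism concrete, here is a minimal sketch of the prefix-to-suffix datastore idea the abstract describes. It substitutes a toy bag-of-words embedder for SUREALM's trained embedding retriever; the function names, the sample sentences, and the retrieval-by-cosine-similarity details are illustrative assumptions, not the paper's actual implementation.

import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic bag-of-words embedding (stand-in for a trained retriever encoder)."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        seed = int(hashlib.md5(tok.encode()).hexdigest()[:8], 16)
        vec += np.random.default_rng(seed).standard_normal(dim)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Offline: index every (prefix -> suffix) split of each training sentence.
train_sentences = [
    "i would like to book a hotel near the airport",
    "i would like to order a taxi to the station",
]
keys, suffixes = [], []
for sent in train_sentences:
    toks = sent.split()
    for i in range(1, len(toks)):
        keys.append(embed(" ".join(toks[:i])))  # prefix embedding = search key
        suffixes.append(" ".join(toks[i:]))     # suffix = simulated "future" context
key_matrix = np.stack(keys)

# Online: embed the current word history and retrieve the suffix whose
# training prefix is most similar to it.
def retrieve_suffix(history: str) -> str:
    scores = key_matrix @ embed(history)  # cosine similarity (unit-norm vectors)
    return suffixes[int(np.argmax(scores))]

print(retrieve_suffix("i would like to book"))  # -> "a hotel near the airport"

A SUREALM-style model would then condition on the retrieved suffix alongside the word history at each generation step, so that next-word prediction sees this simulated "future" context while decoding remains autoregressive.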