Paper Title
Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables
Paper Authors
Abstract
Recently, discrete latent variable models have received a surge of interest in both Natural Language Processing (NLP) and Computer Vision (CV), owing to representation-learning performance comparable to their continuous counterparts while being more interpretable in their predictions. In this paper, we develop a topic-informed discrete latent variable model for semantic textual similarity, which learns a shared latent space for sentence-pair representations via vector quantization. Compared with previous models limited to local semantic contexts, our model can explore richer semantic information via topic modeling. We further boost semantic similarity performance by injecting the quantized representation into a transformer-based language model with a carefully designed, semantics-driven attention mechanism. Through extensive experiments on various English-language datasets, we demonstrate that our model surpasses several strong neural baselines on semantic textual similarity tasks.
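As a rough illustration of the vector-quantization step mentioned in the abstract, the sketch below snaps a continuous sentence-pair embedding to its nearest codebook entry with a straight-through gradient, in PyTorch style. The `VectorQuantizer` name, codebook size, embedding dimension, and loss weighting are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Maps a continuous sentence-pair embedding to its nearest codebook entry.

    Hypothetical sketch: sizes and loss weights are assumptions, not the paper's settings.
    """
    def __init__(self, num_codes: int = 512, dim: int = 256):
        super().__init__()
        # Learnable codebook of discrete latent codes.
        self.codebook = nn.Embedding(num_codes, dim)
        nn.init.uniform_(self.codebook.weight, -1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor):
        # z: (batch, dim) continuous sentence-pair representation from an encoder.
        dist = torch.cdist(z, self.codebook.weight)   # distance to every code, (batch, num_codes)
        codes = dist.argmin(dim=-1)                   # index of the nearest code
        z_q = self.codebook(codes)                    # quantized representation
        # Straight-through estimator: copy gradients from z_q back to z.
        z_q_st = z + (z_q - z).detach()
        # Standard VQ objective: codebook loss plus a commitment term.
        vq_loss = F.mse_loss(z_q, z.detach()) + 0.25 * F.mse_loss(z, z_q.detach())
        return z_q_st, codes, vq_loss
```

In the full model, the quantized representation returned here would then be injected into the transformer-based language model's attention, as described in the abstract.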