Paper Title
A Latent-Variable Model for Intrinsic Probing
Paper Authors
Paper Abstract
The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information. Indeed, it is natural to assume that these pre-trained representations do encode some level of linguistic knowledge as they have brought about large empirical improvements on a wide variety of NLP tasks, which suggests they are learning true linguistic generalization. In this work, we focus on intrinsic probing, an analysis technique where the goal is not only to identify whether a representation encodes a linguistic attribute but also to pinpoint where this attribute is encoded. We propose a novel latent-variable formulation for constructing intrinsic probes and derive a tractable variational approximation to the log-likelihood. Our results show that our model is versatile and yields tighter mutual information estimates than two intrinsic probes previously proposed in the literature. Finally, we find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
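As a rough illustration of the kind of formulation the abstract describes, one could treat the subset of neurons C ⊆ D that carries the attribute as a latent variable and lower-bound the log-likelihood with a standard variational (ELBO-style) argument; the specific distributions and parameterization below are illustrative assumptions, not necessarily the paper's exact derivation:

\log p(\pi \mid \mathbf{h}) = \log \sum_{C \subseteq D} p(C)\, p(\pi \mid \mathbf{h}_C) \;\geq\; \mathbb{E}_{q(C)}\!\big[\log p(\pi \mid \mathbf{h}_C)\big] - \mathrm{KL}\big(q(C) \,\|\, p(C)\big),

where \pi is the linguistic attribute (e.g., a morphosyntactic value), \mathbf{h} the contextual representation, \mathbf{h}_C its restriction to the dimensions in C, and q(C) a variational distribution over dimension subsets. The bound follows from Jensen's inequality and is tight when q(C) matches the true posterior over subsets given \pi and \mathbf{h}.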