Paper Title

The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer

Paper Authors

Pavel Efimov, Leonid Boytsov, Elena Arslanova, Pavel Braslavski

Paper Abstract

Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. Cao et al. (2020) proposed a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other. They showed it to be effective in NLI for five European languages. In contrast, we experiment with a typologically diverse set of languages (Spanish, Russian, Vietnamese, and Hindi) and extend their original implementation to new tasks (XSR, NER, and QA) and an additional training regime (continual learning). Our study reproduced gains in NLI for four languages, and showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while monolingual QA performance never improved and sometimes degraded. Analysis of distances between contextualized embeddings of related and unrelated words (across languages) showed that fine-tuning leads to "forgetting" some of the cross-lingual alignment information. Based on this observation, we further improved NLI performance using continual learning.
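
The cross-lingual adjustment described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it pulls contextual embeddings of aligned words in a parallel sentence pair toward their counterparts produced by a frozen copy of the pretrained model, which serves as the English anchor. The model name, toy sentence pair, word-alignment dictionary, subword pooling choice, and learning rate are all illustrative assumptions, and any regularization the original method may use is omitted here.

# Minimal sketch of cross-lingual adjustment of contextual embeddings
# (illustrative only; sentences, alignments, and hyper-parameters are assumptions).
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)          # model being adjusted
model.train()
anchor = AutoModel.from_pretrained(MODEL_NAME).eval()  # frozen anchor for targets
for p in anchor.parameters():
    p.requires_grad_(False)

def word_vectors(model_, sentence):
    """Mean-pool subword embeddings into one contextual vector per word."""
    enc = tokenizer(sentence, return_tensors="pt")
    hidden = model_(**enc).last_hidden_state[0]        # (num_subwords, dim)
    per_word = {}
    for pos, wid in enumerate(enc.word_ids(0)):        # map subwords to words
        if wid is not None:
            per_word.setdefault(wid, []).append(hidden[pos])
    return {wid: torch.stack(vs).mean(0) for wid, vs in per_word.items()}

# Toy parallel sentence pair with word alignments (English index -> Spanish index).
en_sent = "The cat sleeps on the sofa"
es_sent = "El gato duerme en el sofá"
alignment = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5}

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

# One adjustment step: squared L2 loss between aligned contextual word embeddings.
en_vecs = word_vectors(anchor, en_sent)                # fixed English targets
es_vecs = word_vectors(model, es_sent)                 # trainable side
loss = torch.stack([(es_vecs[j] - en_vecs[i]).pow(2).sum()
                    for i, j in alignment.items()]).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"alignment loss: {loss.item():.4f}")

In the setting the abstract describes, aligned word pairs would come from a small parallel corpus (typically via an automatic word aligner), and the adjustment step above would be repeated over many sentence pairs rather than a single toy example.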
