Paper Title
GreenPLM: Cross-Lingual Transfer of Monolingual Pre-Trained Language Models at Almost No Cost
Paper Authors
Paper Abstract
Large pre-trained models have revolutionized natural language processing (NLP) research and applications, but high training costs and limited data resources have prevented their benefits from being shared equally among speakers of all the world's languages. To address issues of cross-linguistic access to such models and to reduce energy consumption for sustainability during large-scale model training, this study proposes an effective and energy-efficient framework called GreenPLM that uses bilingual lexicons to directly "translate" pre-trained language models of one language into another at almost no additional cost. We validate this approach on BERT models in 18 languages and show that the framework is comparable to, if not better than, other heuristics with high training costs. In addition, given lightweight continued pre-training on limited data where available, the framework outperforms the original monolingual language models in six out of seven tested languages with up to 200x less pre-training effort. Aiming at the Leave No One Behind Principle (LNOB), our approach greatly reduces inequalities between languages as well as energy consumption. We make our code and models publicly available here: \url{https://github.com/qcznlp/GreenPLMs}
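To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of how a bilingual lexicon can "translate" a pre-trained embedding table from a source language into a target language at essentially no training cost. The vocabularies, the lexicon, and the embedding dimension are toy assumptions chosen only for illustration.

```python
# Minimal sketch (assumed, illustrative): build target-language embeddings by
# looking up source-language embeddings through a bilingual lexicon.
import numpy as np

rng = np.random.default_rng(0)

# Source (e.g., English) vocabulary and a stand-in for its pre-trained embeddings.
src_vocab = {"cat": 0, "dog": 1, "house": 2, "[UNK]": 3}
src_emb = rng.normal(size=(len(src_vocab), 8))  # shape: (|V_src|, hidden_size)

# Bilingual lexicon: target-language word -> source-language word (toy example).
lexicon = {"chat": "cat", "chien": "dog", "maison": "house"}

# Target vocabulary induced from the lexicon, plus an unknown token.
tgt_vocab = {w: i for i, w in enumerate(list(lexicon) + ["[UNK]"])}

# "Translate" the embedding table: copy the source row for each target word with
# a lexicon entry; fall back to the source [UNK] row otherwise.
tgt_emb = np.empty((len(tgt_vocab), src_emb.shape[1]))
for word, idx in tgt_vocab.items():
    src_word = lexicon.get(word, "[UNK]")
    tgt_emb[idx] = src_emb[src_vocab.get(src_word, src_vocab["[UNK]"])]

print(tgt_emb.shape)  # (4, 8): a target-language embedding table with no pre-training
```

In this sketch only the input embedding table is remapped; the abstract's lightweight continued pre-training on limited target-language data would then further adapt such a translated model. For the authors' actual procedure, see the code at \url{https://github.com/qcznlp/GreenPLMs}.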