Paper Title

RuBioRoBERTa: a pre-trained biomedical language model for Russian language biomedical text mining

Paper Authors

Alexander Yalunin, Alexander Nesterov, Dmitriy Umerenkov

Paper Abstract

This paper presents several BERT-based models for Russian language biomedical text mining (RuBioBERT, RuBioRoBERTa). The models are pre-trained on a corpus of freely available texts in the Russian biomedical domain. With this pre-training, our models demonstrate state-of-the-art results on RuMedBench, a Russian medical language understanding benchmark that covers a diverse set of tasks, including text classification, question answering, natural language inference, and named entity recognition.
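
As a quick illustration (not taken from the paper), pre-trained checkpoints like these can typically be loaded through the Hugging Face transformers API. The sketch below assumes the model is published on the Hugging Face Hub; the model identifier and the example sentence are placeholders, not values given in the abstract.

    # Minimal sketch: loading a pre-trained Russian biomedical encoder with
    # Hugging Face transformers. MODEL_ID is an assumed Hub identifier;
    # replace it with the checkpoint actually released by the authors.
    from transformers import AutoTokenizer, AutoModel

    MODEL_ID = "alexyalunin/RuBioRoBERTa"  # assumption, verify before use

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)

    # Encode a short Russian clinical sentence and get contextual embeddings.
    text = "Пациент жалуется на головную боль и повышенную температуру."
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model(**inputs)

    print(outputs.last_hidden_state.shape)  # (batch, tokens, hidden_size)

The resulting token embeddings could then be fed to task-specific heads for the kinds of downstream tasks the benchmark covers (classification, NER, and so on).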
