Paper Title
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining
Paper Authors
Paper Abstract
Multi-task learning (MTL) has achieved remarkable success in natural language processing applications. In this work, we study a multi-task learning model with multiple decoders on a variety of biomedical and clinical natural language processing tasks, such as text similarity, relation extraction, named entity recognition, and text inference. Our empirical results demonstrate that the MTL fine-tuned models outperform state-of-the-art transformer models (e.g., BERT and its variants) by 2.0% and 1.3% in the biomedical and clinical domains, respectively. Pairwise MTL further reveals which tasks improve or hurt the performance of others. This is particularly helpful when researchers must choose a suitable model for a new problem. The code and models are publicly available at https://github.com/ncbi-nlp/bluebert.
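The abstract describes hard parameter sharing: one shared encoder feeding several task-specific decoders. The following is a minimal structural sketch of that pattern only; the toy encoder and heads are hypothetical stand-ins, not the paper's actual BERT-based implementation.

```python
# Hypothetical sketch of hard-parameter-sharing multi-task learning:
# a single shared encoder plus one decoder (head) per task.
# The paper instead fine-tunes a shared BERT encoder with learned heads.

class SharedEncoder:
    """Stand-in for the shared encoder: maps tokens to a toy representation."""
    def encode(self, tokens):
        # Toy representation: (number of tokens, total character length).
        return (len(tokens), sum(len(t) for t in tokens))

class TaskHead:
    """One task-specific decoder; a real head would be a learned layer."""
    def __init__(self, name):
        self.name = name

    def predict(self, rep):
        # Toy decision rule in place of a learned classifier/regressor.
        return f"{self.name}:{rep[0] % 2}"

class MultiTaskModel:
    def __init__(self, task_names):
        self.encoder = SharedEncoder()                      # shared by all tasks
        self.heads = {n: TaskHead(n) for n in task_names}   # per-task decoders

    def predict(self, task, tokens):
        rep = self.encoder.encode(tokens)    # shared representation
        return self.heads[task].predict(rep) # task-specific decoding

model = MultiTaskModel(["ner", "relation", "similarity", "inference"])
print(model.predict("ner", ["BRCA1", "mutation"]))  # prints "ner:0"
```

During MTL fine-tuning, gradients from every task update the shared encoder, while each head is updated only by its own task's loss; this sharing is what lets related tasks help each other, and what lets unrelated tasks interfere, as the pairwise MTL experiments examine.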