Paper Title


BabyBear: Cheap inference triage for expensive language models

Paper Authors

Leila Khalili, Yao You, John Bohannon

Abstract


Transformer language models provide superior accuracy over previous models but they are computationally and environmentally expensive. Borrowing the concept of model cascading from computer vision, we introduce BabyBear, a framework for cascading models for natural language processing (NLP) tasks to minimize cost. The core strategy is inference triage, exiting early when the least expensive model in the cascade achieves a sufficiently high-confidence prediction. We test BabyBear on several open source data sets related to document classification and entity recognition. We find that for common NLP tasks a high proportion of the inference load can be accomplished with cheap, fast models that have learned by observing a deep learning model. This allows us to reduce the compute cost of large-scale classification jobs by more than 50% while retaining overall accuracy. For named entity recognition, we save 33% of the deep learning compute while maintaining an F1 score higher than 95% on the CoNLL benchmark.
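The inference-triage strategy described above can be sketched in a few lines: run the cheapest model first and return its answer when its confidence clears a threshold, falling back to the expensive model otherwise. This is a minimal illustration, not BabyBear's actual implementation; `cheap_model`, `expensive_model`, and the 0.9 threshold are hypothetical stand-ins.

```python
def triage_predict(text, cheap_model, expensive_model, threshold=0.9):
    """Return the cheap model's prediction when it is confident enough;
    otherwise fall back to the expensive model (early exit on confidence)."""
    label, confidence = cheap_model(text)
    if confidence >= threshold:
        # Early exit: the cheap model is confident, skip the deep model.
        return label
    return expensive_model(text)


# Toy stand-ins for demonstration only: the "cheap model" is confident
# only on short inputs, so long inputs get routed to the expensive model.
def toy_cheap(text):
    return ("short", 0.95) if len(text) < 20 else ("unknown", 0.3)


def toy_expensive(text):
    return "long"


print(triage_predict("hi", toy_cheap, toy_expensive))      # -> short
print(triage_predict("a" * 40, toy_cheap, toy_expensive))  # -> long
```

The savings reported in the abstract come from the fraction of inputs that take the early-exit branch; tuning the threshold trades compute savings against agreement with the expensive model.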
