Paper Title


BabyBear: Cheap inference triage for expensive language models

Paper Authors

Leila Khalili, Yao You, John Bohannon

Abstract


Transformer language models provide superior accuracy over previous models but they are computationally and environmentally expensive. Borrowing the concept of model cascading from computer vision, we introduce BabyBear, a framework for cascading models for natural language processing (NLP) tasks to minimize cost. The core strategy is inference triage, exiting early when the least expensive model in the cascade achieves a sufficiently high-confidence prediction. We test BabyBear on several open source data sets related to document classification and entity recognition. We find that for common NLP tasks a high proportion of the inference load can be accomplished with cheap, fast models that have learned by observing a deep learning model. This allows us to reduce the compute cost of large-scale classification jobs by more than 50% while retaining overall accuracy. For named entity recognition, we save 33% of the deep learning compute while maintaining an F1 score higher than 95% on the CoNLL benchmark.
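The inference-triage strategy described above can be sketched in a few lines: run the cheapest model first and return its answer when its confidence clears a threshold, falling back to the expensive model otherwise. This is a minimal illustration, not BabyBear's actual implementation; `cheap_model`, `expensive_model`, and the 0.9 threshold are hypothetical stand-ins.

```python
def triage_predict(text, cheap_model, expensive_model, threshold=0.9):
    """Return the cheap model's prediction when it is confident enough;
    otherwise fall back to the expensive model (early exit on confidence)."""
    label, confidence = cheap_model(text)
    if confidence >= threshold:
        # Early exit: the cheap model is confident, skip the deep model.
        return label
    return expensive_model(text)


# Toy stand-ins for demonstration only: the "cheap model" is confident
# only on short inputs, so long inputs get routed to the expensive model.
def toy_cheap(text):
    return ("short", 0.95) if len(text) < 20 else ("unknown", 0.3)


def toy_expensive(text):
    return "long"


print(triage_predict("hi", toy_cheap, toy_expensive))      # -> short
print(triage_predict("a" * 40, toy_cheap, toy_expensive))  # -> long
```

The savings reported in the abstract come from the fraction of inputs that take the early-exit branch; tuning the threshold trades compute savings against agreement with the expensive model.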
