Paper Title

Strategies for Optimizing End-to-End Artificial Intelligence Pipelines on Intel Xeon Processors

Authors

Meena Arunachalam, Vrushabh Sanghavi, Yi A. Yao, Yi A. Zhou, Lifeng A. Wang, Zongru Wen, Niroop Ammbashankar, Ning W. Wang, Fahim Mohammad

Abstract


End-to-end (E2E) artificial intelligence (AI) pipelines are composed of several stages, including data preprocessing, data ingestion, model definition and training, hyperparameter optimization, deployment, inference, and postprocessing, followed by downstream analyses. To obtain an efficient E2E workflow, almost all stages of the pipeline need to be optimized. Intel Xeon processors come with large memory capacities, are bundled with AI acceleration (e.g., Intel Deep Learning Boost), are well suited to running multiple instances of training and inference pipelines in parallel, and have a low total cost of ownership (TCO). To showcase the performance of Xeon processors, we applied comprehensive optimization strategies, coupled with software and hardware acceleration, to a variety of E2E pipelines in the areas of computer vision, NLP, recommendation systems, etc. We achieved performance improvements ranging from 1.8x to 81.7x across the different E2E pipelines. In this paper, we highlight the optimization strategies we adopted to achieve this performance on Intel Xeon processors with a set of eight different E2E pipelines.
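The abstract notes that Xeon's large memory capacity suits running multiple training/inference pipeline instances in parallel on one machine. A minimal, stdlib-only sketch of that multi-instance pattern is below; the worker body, instance count, and per-instance thread split are illustrative assumptions, not the paper's actual method (a real pipeline would load a framework model and cap threads via `OMP_NUM_THREADS` or similar so instances share the socket's cores without oversubscription):

```python
import multiprocessing as mp
import os

def run_inference_instance(instance_id, num_threads):
    # Hypothetical worker: in a real pipeline this would set the thread
    # budget before loading a model (e.g., one built with oneDNN) and then
    # serve its shard of requests. Capping threads per instance lets
    # several instances run side by side on one socket.
    os.environ["OMP_NUM_THREADS"] = str(num_threads)
    # Placeholder "inference" workload so the sketch stays runnable.
    result = sum(i * i for i in range(10_000))
    return instance_id, result

if __name__ == "__main__":
    cores = os.cpu_count() or 4
    instances = 4                                   # assumed instance count
    threads_per_instance = max(1, cores // instances)
    with mp.Pool(instances) as pool:
        results = pool.starmap(
            run_inference_instance,
            [(i, threads_per_instance) for i in range(instances)],
        )
    print(sorted(results))
```

In practice such instances are often also pinned to disjoint core ranges (e.g., with `numactl` or `taskset`) so each one keeps local NUMA memory, which is part of what makes the multi-instance setup attractive on large-memory Xeon systems.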
