Paper Title
Fine-Grained Modeling and Optimization for Intelligent Resource Management in Big Data Processing
Authors
Abstract
Big data processing at production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting the performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, the placement of parallel instances on machines, and the resource allocation to each instance), requires multi-objective optimization (MOO), and is compounded by the scale and complexity of big data systems, all while meeting stringent time constraints for scheduling. This paper presents a MaxCompute-based integrated system that supports multi-objective resource optimization via fine-grained instance-level modeling and optimization. We propose a new architecture that breaks RO into a series of simpler problems, new fine-grained predictive models, and novel optimization methods that exploit these models to make effective instance-level recommendations in a hierarchical MOO framework. Evaluation using production workloads shows that our new RO system can simultaneously reduce latency by 37-72% and cost by 43-78% compared to the current optimizer and scheduler, while running in 0.02-0.23 s.