End-to-end Optimization of Machine Learning Prediction Queries

Paper Title

End-to-end Optimization of Machine Learning Prediction Queries

Authors

Park, Kwanghyun; Saur, Karla; Banda, Dalitso; Sen, Rathijit; Interlandi, Matteo; Karanasos, Konstantinos

Abstract

Prediction queries are widely used across industries to perform advanced analytics and draw insights from data. They include a data processing part (e.g., for joining, filtering, cleaning, featurizing the datasets) and a machine learning (ML) part invoking one or more trained models to perform predictions. These parts have so far been optimized in isolation, leaving significant opportunities for optimization unexplored. We present Raven, a production-ready system for optimizing prediction queries. Raven follows the enterprise architectural trend of collocating data and ML runtimes. It relies on a unified intermediate representation that captures both data and ML operators in a single graph structure to unlock two families of optimizations. First, it employs logical optimizations that pass information between the data part (and the properties of the underlying data) and the ML part to optimize each other. Second, it introduces logical-to-physical transformations that allow operators to be executed on different runtimes (relational, ML, and DNN) and hardware (CPU, GPU). Novel data-driven optimizations determine the runtime to be used for each part of the query to achieve optimal performance. Our evaluation shows that Raven improves performance of prediction queries on Apache Spark and SQL Server by up to 13.1x and 330x, respectively. For complex models where GPU acceleration is beneficial, Raven provides up to 8x speedup compared to state-of-the-art systems.
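
To make the first family of optimizations more concrete, the following is a minimal Python sketch of one way information from the data part could simplify the ML part: a filter predicate known from the query is used to drop decision-tree branches that no qualifying row can reach. This is an illustration only, not Raven's implementation or API; the names `Leaf`, `Split`, `prune`, and `predict`, the example tree, and the filter `age >= 60` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass
class Leaf:
    value: float


@dataclass
class Split:
    feature: str
    threshold: float  # rows with row[feature] <= threshold go left, others go right
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]
# Hypothetical representation of what the data part knows about each feature:
# feature name -> (inclusive lower bound, inclusive upper bound).
Bounds = Dict[str, Tuple[float, float]]


def prune(node: Node, bounds: Bounds) -> Node:
    """Drop branches that cannot be reached by any row within the given bounds."""
    if isinstance(node, Leaf):
        return node
    lo, hi = bounds.get(node.feature, (float("-inf"), float("inf")))
    if hi <= node.threshold:   # every qualifying row goes left
        return prune(node.left, bounds)
    if lo > node.threshold:    # every qualifying row goes right
        return prune(node.right, bounds)
    # Both sides are reachable, so keep the split and prune below it.
    return Split(node.feature, node.threshold,
                 prune(node.left, bounds), prune(node.right, bounds))


def predict(node: Node, row: Dict[str, float]) -> float:
    """Evaluate the tree on a single row."""
    while isinstance(node, Split):
        node = node.left if row[node.feature] <= node.threshold else node.right
    return node.value


# Assumed example: the query filters on age >= 60 before invoking the model.
tree = Split("age", 50.0, Leaf(0.1), Split("bmi", 30.0, Leaf(0.4), Leaf(0.9)))
pruned = prune(tree, {"age": (60.0, float("inf"))})
row = {"age": 70.0, "bmi": 35.0}
```

Here `pruned` no longer contains the split on `age`, yet `predict(pruned, row)` and `predict(tree, row)` agree for every row that satisfies the filter. The sketch deliberately does not tighten the bounds while descending, so it may keep some branches that a more careful analysis could remove.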
