FlexServe: Deployment of PyTorch Models as Flexible REST Endpoints

Paper title

FlexServe: Deployment of PyTorch Models as Flexible REST Endpoints

Authors

Edward Verenich, Alvaro Velasquez, M. G. Sarwar Murshed, Faraz Hussain

Abstract

The integration of artificial intelligence capabilities into modern software systems is increasingly being simplified through the use of cloud-based machine learning services and representational state transfer (REST) architecture design. However, insufficient information about the provenance of the underlying models and the lack of control over model evolution impede wider adoption of these services in many operational environments that have strict security requirements. Furthermore, tools such as TensorFlow Serving allow models to be deployed as RESTful endpoints, but require error-prone transformations for PyTorch models because these use dynamic computational graphs, in contrast to the static computational graphs of TensorFlow. To enable rapid deployment of PyTorch models without intermediate transformations, we have developed FlexServe, a simple library to deploy multi-model ensembles with flexible batching.
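
To make the idea in the abstract concrete, the following is a minimal sketch of a REST endpoint that runs an ensemble of models on a batch of any size. It is not FlexServe's API. It uses only the Python standard library so it runs anywhere, and the names `model_sum`, `model_max`, `ENSEMBLE`, `predict`, `PredictHandler`, `serve`, the `/predict` path, and the `{"inputs": ...}` payload shape are all assumptions made for illustration. The two "models" are plain functions standing in for trained PyTorch modules.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


# Hypothetical stand-ins for trained models: each maps a batch (a list of
# feature vectors) to one score per sample. In a PyTorch deployment these
# would be torch.nn.Module instances; plain functions keep the sketch
# dependency-free.
def model_sum(batch):
    return [float(sum(x)) for x in batch]


def model_max(batch):
    return [float(max(x)) for x in batch]


ENSEMBLE = {"model_sum": model_sum, "model_max": model_max}


def predict(batch):
    """Run every ensemble member on the same batch, whatever its size."""
    return {name: model(batch) for name, model in ENSEMBLE.items()}


class PredictHandler(BaseHTTPRequestHandler):
    """Handles POST /predict with a JSON body of the form {"inputs": [[...], ...]}."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["inputs"])).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the endpoint on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the endpoint. A client can then POST `{"inputs": [[1.0, 2.0]]}` for a single sample or a longer list for a larger batch, and receives one list of scores per ensemble member. Because the batch dimension is simply the length of the `inputs` list, no fixed batch size is baked into the endpoint, which is the sense of "flexible batching" illustrated here.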
