Paper Title
An Edge-Cloud Integrated Framework for Flexible and Dynamic Stream Analytics
Paper Authors
Paper Abstract
With the popularity of the Internet of Things (IoT), edge computing, and cloud computing, more and more stream analytics applications are being developed, including real-time trend prediction and object detection on top of IoT sensing data. One popular type of stream analytics is time series or sequence data prediction and forecasting based on recurrent neural network (RNN) deep learning models. Unlike traditional analytics, which assumes data are available ahead of time and will not change, stream analytics deals with data that are generated continuously and whose trend/distribution can change (a.k.a. concept drift), causing prediction/forecasting accuracy to drop over time. Another challenge is finding the best resource provisioning for stream analytics to achieve good overall latency. In this paper, we study how to best leverage edge and cloud resources to achieve better accuracy and latency for stream analytics using a type of RNN model called long short-term memory (LSTM). We propose a novel edge-cloud integrated framework for hybrid stream analytics that supports low-latency inference on the edge and high-capacity training on the cloud. To achieve flexible deployment, we study different approaches to deploying our hybrid learning framework, including edge-centric, cloud-centric, and edge-cloud integrated. Further, our hybrid learning framework can dynamically combine inference results from an LSTM model pre-trained on historical data and another LSTM model re-trained periodically on the most recent data. Using real-world and simulated stream datasets, our experiments show that the proposed edge-cloud deployment achieves the lowest latency among the three deployment types, and that our dynamic learning approach achieves the best accuracy among all learning approaches under all three concept drift scenarios.
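To make the dynamic combination step described in the abstract concrete, below is a minimal sketch of one plausible way to merge predictions from a pre-trained LSTM and a periodically re-trained LSTM by weighting each model inversely to its recent error on the stream. The function name `combine_predictions`, the inverse-error weighting rule, and the sample numbers are illustrative assumptions, not details taken from the paper.

```python
def combine_predictions(pretrained_pred, retrained_pred,
                        pretrained_recent_err, retrained_recent_err,
                        eps=1e-8):
    """Combine two models' predictions, weighting each model by the inverse
    of its recent error (e.g., MAE over a sliding window of the stream).
    This is one plausible dynamic-combination rule; the paper may use another.
    """
    w_pre = 1.0 / (pretrained_recent_err + eps)   # weight for the pre-trained LSTM
    w_re = 1.0 / (retrained_recent_err + eps)     # weight for the re-trained LSTM
    total = w_pre + w_re
    return (w_pre * pretrained_pred + w_re * retrained_pred) / total


# Hypothetical usage with stand-in numbers: the pre-trained model predicts 10.2,
# the re-trained model predicts 11.0, and their recent errors are 0.8 and 0.3,
# so the combined prediction leans toward the re-trained model.
print(combine_predictions(10.2, 11.0,
                          pretrained_recent_err=0.8,
                          retrained_recent_err=0.3))
```

Under concept drift, the re-trained model's recent error tends to shrink relative to the pre-trained model's, so a rule of this shape automatically shifts weight toward the model that better fits the current data distribution.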