Paper Title

Plotly-Resampler: Effective Visual Analytics for Large Time Series

Authors

Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost, Sofie Van Hoecke

Abstract

Visual analytics is arguably the most important step in getting acquainted with your data. This is especially the case for time series, as this data type is hard to describe and cannot be fully understood through, for example, summary statistics alone. To realize effective time series visualization, four requirements have to be met: a tool should be (1) interactive, (2) scalable to millions of data points, (3) integrable in conventional data science environments, and (4) highly configurable. We observe that open source Python visualization toolkits empower data scientists in most visual analytics tasks, but lack the combination of scalability and interactivity needed for effective time series visualization. To meet these requirements, we created Plotly-Resampler, an open source Python library. Plotly-Resampler is an add-on for Plotly's Python bindings that enhances line chart scalability on top of an interactive toolkit by aggregating the underlying data depending on the current graph view. Plotly-Resampler is built to be snappy, as the reactivity of a tool qualitatively affects how analysts visually explore and analyze data. A benchmark task highlights how our toolkit scales better than alternatives in terms of number of samples and number of time series. Additionally, Plotly-Resampler's flexible data aggregation functionality paves the way for research into novel aggregation techniques. Plotly-Resampler's integrability, together with its configurability, convenience, and high scalability, allows effective analysis of high-frequency data in your day-to-day Python environment.
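
The idea of aggregating the underlying data depending on the current graph view can be illustrated with a small sketch. This is not Plotly-Resampler's implementation or API. The function name `aggregate_view`, the `n_out` parameter, and the choice of a per-bin min-max aggregation are assumptions made only for illustration, and the sketch assumes `x` is sorted in ascending order.

```python
import numpy as np


def aggregate_view(x, y, x_start, x_end, n_out=1000):
    """Hypothetical helper: return at most n_out points of (x, y) that lie
    inside the visible range [x_start, x_end], keeping each bin's min and max.
    Assumes x is sorted ascending and n_out >= 2."""
    # Restrict the data to the currently visible x-range.
    lo = np.searchsorted(x, x_start, side="left")
    hi = np.searchsorted(x, x_end, side="right")
    xv, yv = x[lo:hi], y[lo:hi]

    # Few enough points in view: show the raw data unchanged.
    if len(xv) <= n_out:
        return xv, yv

    # Otherwise split the view into equally sized bins and keep the
    # minimum and maximum sample of each bin (illustrative choice).
    n_bins = n_out // 2
    edges = np.linspace(0, len(xv), n_bins + 1).astype(int)
    keep = []
    for a, b in zip(edges[:-1], edges[1:]):
        seg = yv[a:b]
        i_min = a + int(np.argmin(seg))
        i_max = a + int(np.argmax(seg))
        keep.extend(sorted({i_min, i_max}))  # preserve time order
    idx = np.array(keep)
    return xv[idx], yv[idx]


# One million samples of a noisy sine wave.
x = np.arange(1_000_000, dtype=float)
rng = np.random.default_rng(0)
y = np.sin(x / 5_000) + 0.1 * rng.standard_normal(x.size)

# Full view: heavily aggregated. Zoomed view: raw samples are returned.
full_x, full_y = aggregate_view(x, y, x[0], x[-1], n_out=1000)
zoom_x, zoom_y = aggregate_view(x, y, 1000, 1500, n_out=1000)
```

In an interactive setting, a function like this would be re-run whenever the user zooms or pans, so the number of points sent to the front end stays bounded while the visible detail increases as the view narrows.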
