Paper title
Xtreaming: an incremental multidimensional projection technique and its application to streaming data
Paper authors
Abstract
Streaming data applications are becoming more common because information sources such as sensors and social media can capture or produce data continuously. Despite recent advances, most visualization approaches, in particular multidimensional projection (dimensionality reduction) techniques, cannot be applied directly in such scenarios because of the transient nature of streaming data. Currently, only a few methods address this limitation with online or incremental strategies that continuously process the data and update the visualization. Despite their relative success, most of them need to store the data and access it multiple times, which makes them unsuitable for streaming scenarios where the data grow continuously. Others do not impose such requirements but cannot update the positions of data that have already been projected, potentially resulting in visual artifacts. In this paper, we present Xtreaming, a novel incremental projection technique that continuously updates the visual representation to reflect newly emerging structures or patterns without visiting the multidimensional data more than once. Our tests show that Xtreaming is competitive in terms of global distance preservation when compared with other streaming and incremental techniques, while being orders of magnitude faster. To the best of our knowledge, it is the first methodology capable of evolving a projection to faithfully represent newly emerging structures without the need to store all the data, providing reliable results for projecting streaming data efficiently and effectively.
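The abstract evaluates projections by global distance preservation but does not say which measure is used. As an illustration only, the sketch below computes normalized stress, one common way to quantify how well pairwise distances in the original space are kept in the projected space (0 means perfect preservation). The function name `normalized_stress` and the sample points are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def normalized_stress(high, low):
    """Normalized stress between original points and their projection.

    Assumption: this is a generic distance-preservation measure chosen for
    illustration; the abstract does not state the metric the authors used.
    `high` and `low` are equal-length sequences of coordinate tuples.
    """
    if len(high) != len(low):
        raise ValueError("high and low must contain the same number of points")

    squared_error = 0.0  # sum of (d_high - d_low)^2 over all pairs
    squared_total = 0.0  # sum of d_high^2 over all pairs
    for i, j in combinations(range(len(high)), 2):
        d_high = math.dist(high[i], high[j])
        d_low = math.dist(low[i], low[j])
        squared_error += (d_high - d_low) ** 2
        squared_total += d_high ** 2

    if squared_total == 0.0:
        raise ValueError("need at least two distinct points in the original space")
    return math.sqrt(squared_error / squared_total)


# Hypothetical data: three 3-D points that lie in a plane, and a 2-D
# projection that keeps every pairwise distance unchanged.
original = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]
projected = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
stress = normalized_stress(original, projected)
```

Lower values indicate better global distance preservation. Computing this measure needs all pairwise distances, so it is suited to offline evaluation of a finished projection rather than to the streaming step itself.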