Paper Title

Modeling the Influence of Visual Density on Cluster Perception in Scatterplots Using Topology

Authors

Ghulam Jilani Quadri, Paul Rosen

Abstract

Scatterplots are used for a variety of visual analytics tasks, including cluster identification, and the visual encodings used on a scatterplot play a deciding role in the level of visual separation of clusters. For visualization designers, optimizing the visual encodings is crucial to maximizing the clarity of data. This requires accurately modeling human perception of cluster separation, which remains challenging. We present a multi-stage user study focusing on four factors (distribution size of clusters, number of points, size of points, and opacity of points) that influence cluster identification in scatterplots. From these parameters, we have constructed two models, a distance-based model and a density-based model, using the merge tree data structure from Topological Data Analysis. Our analysis demonstrates that these factors play an important role in the number of clusters perceived, and it verifies that the distance-based and density-based models can reasonably estimate the number of clusters a user observes. Finally, we demonstrate how these models can be used to optimize visual encodings on real-world data.
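
To make the merge tree idea in the abstract concrete, the following is a minimal sketch of how merge events can be computed for a plain point set. It is not the authors' model: it ignores point size, opacity, and the other visual factors studied in the paper, and simply records the distances at which connected components join under single linkage, using a union-find structure. The function names `merge_distances` and `count_clusters` and the `threshold` parameter are hypothetical, introduced only for illustration.

```python
import math
from itertools import combinations


def merge_distances(points):
    """Return the distances at which components merge (single linkage).

    Each returned value corresponds to one merge event, i.e. one internal
    node of a distance-based merge tree over the points.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, processed from shortest to longest.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    merges = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge joins two separate components
            parent[ri] = rj
            merges.append(d)
    return merges


def count_clusters(points, threshold):
    """Count components that are still separate at the given distance.

    `threshold` is an assumed, user-chosen cutoff: merges that happen at a
    larger distance are treated as separating distinct clusters.
    """
    if not points:
        return 0
    return 1 + sum(1 for d in merge_distances(points) if d > threshold)


# Two well-separated groups of three points each.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(count_clusters(pts, threshold=2.0))  # prints 2
```

Raising or lowering `threshold` changes how many merge events count as cluster boundaries, which is loosely analogous to how a perceptual model would decide which separations a viewer can still see.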
