Paper Title

GEViTRec: Data Reconnaissance Through Recommendation Using a Domain-Specific Prevalence Visualization Design Space

Authors

Anamaria Crisan, Shannah Fisher, Jennifer L. Gardy, Tamara Munzner

Abstract

Genomic Epidemiology (genEpi) is a branch of public health that uses many different data types including tabular, network, genomic, and geographic, to identify and contain outbreaks of deadly diseases. Due to the volume and variety of data, it is challenging for genEpi domain experts to conduct data reconnaissance; that is, have an overview of the data they have and make assessments toward its quality, completeness, and suitability. We present an algorithm for data reconnaissance through automatic visualization recommendation, GEViTRec. Our approach handles a broad variety of dataset types and automatically generates coordinated combinations of charts, in contrast to existing systems that primarily focus on singleton visual encodings of tabular datasets. We automatically detect linkages across multiple input datasets by analyzing non-numeric attribute fields, creating an entity graph within which we analyze and rank paths. For each high-ranking path, we specify chart combinations with spatial and color alignments between shared fields, using a gradual binding approach to transform initial partial specifications of singleton charts to complete specifications that are aligned and oriented consistently. A novel aspect of our approach is its combination of domain-agnostic elements with domain-specific information that is captured through a domain-specific visualization prevalence design space. Our implementation is applied to both synthetic data and real data from an Ebola outbreak. We compare GEViTRec's output to what previous visualization recommendation systems would generate, and to manually crafted visualizations used by practitioners. We conducted formative evaluations with ten genEpi experts to assess the relevance and interpretability of our results.
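The linkage-detection step described in the abstract can be illustrated with a minimal sketch. It is not GEViTRec's implementation: the Jaccard overlap score, the 0.5 threshold, the function names, and the toy datasets are all assumptions made for illustration. It only links datasets whose non-numeric fields share values; the abstract does not say how paths in the entity graph are ranked, so that step is left out.

```python
from itertools import combinations


def is_non_numeric(values):
    """Return True if at least one value cannot be parsed as a number."""
    for v in values:
        try:
            float(v)
        except (TypeError, ValueError):
            return True
    return False


def jaccard(a, b):
    """Jaccard overlap of two value collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_entity_graph(datasets, threshold=0.5):
    """Link dataset pairs through their best-overlapping non-numeric fields.

    datasets: {dataset_name: {field_name: [values]}}
    Returns {(name_a, name_b): (field_a, field_b, score)}.
    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    edges = {}
    for (name_a, data_a), (name_b, data_b) in combinations(datasets.items(), 2):
        best = None
        for field_a, vals_a in data_a.items():
            if not is_non_numeric(vals_a):
                continue  # only non-numeric fields are candidate link keys
            for field_b, vals_b in data_b.items():
                if not is_non_numeric(vals_b):
                    continue
                score = jaccard(vals_a, vals_b)
                if score >= threshold and (best is None or score > best[2]):
                    best = (field_a, field_b, score)
        if best is not None:
            edges[(name_a, name_b)] = best
    return edges


# Toy, made-up inputs: a tabular line list, a genomic dataset, a spatial dataset.
datasets = {
    "line_list": {
        "case_id": ["C1", "C2", "C3"],
        "age": [34, 51, 8],
        "district": ["North", "South", "East"],
    },
    "genomes": {
        "sample": ["C1", "C2", "C4"],
        "length": [18959, 18957, 18960],
    },
    "map": {
        "region": ["North", "South", "East", "West"],
    },
}

edges = build_entity_graph(datasets)

# Strongest links first.
ranked = sorted(edges.items(), key=lambda kv: kv[1][2], reverse=True)
for (a, b), (field_a, field_b, score) in ranked:
    print(f"{a}.{field_a} <-> {b}.{field_b}: {score:.2f}")
```

In this toy input the line list connects to the map through district names and to the genomes through case identifiers, while the genomes and the map share no field and stay unlinked.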
