Paper Title

Latent Feature Representation via Unsupervised Learning for Pattern Discovery in Massive Electron Microscopy Image Volumes

Authors

Huang, Gary B; Yang, Huei-Fang; Takemura, Shin-ya; Rivlin, Pat; Plaza, Stephen M

Abstract

We propose a method to facilitate exploration and analysis of new large data sets. In particular, we give an unsupervised deep learning approach to learning a latent representation that captures semantic similarity in the data set. The core idea is to use data augmentations that preserve semantic meaning to generate synthetic examples of elements whose feature representations should be close to one another. We demonstrate the utility of our method applied to nano-scale electron microscopy data, where even relatively small portions of animal brains can require terabytes of image data. Although supervised methods can be used to predict and identify known patterns of interest, the scale of the data makes it difficult to mine and analyze patterns that are not known a priori. We show the ability of our learned representation to enable query by example, so that if a scientist notices an interesting pattern in the data, they can be presented with other locations with matching patterns. We also demonstrate that clustering of data in the learned space correlates with biologically meaningful distinctions. Finally, we introduce a visualization tool and software ecosystem to facilitate user-friendly interactive analysis and uncover interesting biological patterns. In short, our work opens possible new avenues in understanding of and discovery in large data sets, arising in domains such as EM analysis.
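
The query-by-example idea in the abstract can be illustrated with a minimal sketch. It assumes each image location already has a learned feature vector, and it ranks locations by cosine similarity to the query. The abstract does not say which similarity measure or search structure the authors use, so the cosine metric, the function names, and the toy coordinates and vectors below are all illustrative assumptions and not the paper's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def query_by_example(query, embeddings, k=3):
    """Return the k locations whose features are most similar to `query`.

    `embeddings` maps a location (hypothetical (x, y, z) voxel coordinates)
    to its feature vector. Brute-force search is used for clarity; a real
    terabyte-scale volume would need an approximate nearest-neighbor index.
    """
    scored = [(cosine_similarity(query, emb), loc)
              for loc, emb in embeddings.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [loc for _, loc in scored[:k]]


# Toy data: three locations with made-up 2-D features (assumption).
toy_embeddings = {
    (0, 0, 0): [1.0, 0.0],
    (10, 0, 0): [0.9, 0.1],
    (0, 10, 0): [0.0, 1.0],
}
matches = query_by_example([1.0, 0.0], toy_embeddings, k=2)
```

Here `matches` holds the two locations whose toy features point in nearly the same direction as the query, which is the behavior a scientist would rely on when asking for other places that look like a pattern they have just noticed.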
