Paper title
Unsupervised learning of features and object boundaries from local prediction
Authors
Abstract
A visual system has to learn both which features to extract from images and how to group locations into (proto-)objects. Those two aspects are usually dealt with separately, although predictability is discussed as a cue for both. To incorporate features and boundaries into the same model, we model a layer of feature maps with a pairwise Markov random field model in which each factor is paired with an additional binary variable, which switches the factor on or off. Using one of two contrastive learning objectives, we can learn both the features and the parameters of the Markov random field factors from images without further supervision signals. The features learned by shallow neural networks based on this loss are local averages, opponent colors, and Gabor-like stripe patterns. Furthermore, we can infer connectivity between locations by inferring the switch variables. Contours inferred from this connectivity perform quite well on the Berkeley segmentation database (BSDS500) without any training on contours. Thus, computing predictions across space aids both segmentation and feature learning, and models trained to optimize these predictions show similarities to the human visual system. We speculate that retinotopic visual cortex might implement such predictions over space through lateral connections.
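To make the idea of inferring switch variables concrete, here is a minimal sketch. It is not the paper's model: the abstract does not specify the form of the factors, so this sketch assumes each pairwise factor is a zero-mean isotropic Gaussian on the feature difference between horizontal neighbors, narrow when the switch is on (connected) and broad when it is off. The function name `switch_posterior` and the parameters `sigma_on`, `sigma_off`, and `prior_on` are hypothetical and chosen only for illustration.

```python
import numpy as np


def switch_posterior(features, sigma_on=0.5, sigma_off=2.0, prior_on=0.5):
    """Posterior probability that each horizontal neighbor pair is connected.

    ASSUMPTION (not from the paper): the factor on the feature difference is a
    zero-mean isotropic Gaussian with std `sigma_on` when the switch is on and
    `sigma_off` when it is off.

    features: array of shape (H, W, C), one C-dimensional feature per location.
    Returns an array of shape (H, W - 1) with values in [0, 1].
    """
    features = np.asarray(features, dtype=float)
    n_channels = features.shape[-1]

    # Feature difference between each location and its right-hand neighbor.
    diff = features[:, 1:, :] - features[:, :-1, :]
    sq_dist = np.sum(diff ** 2, axis=-1)

    # Log-likelihood ratio log N(diff; 0, sigma_on^2 I) - log N(diff; 0, sigma_off^2 I).
    log_ratio = (
        -0.5 * sq_dist * (1.0 / sigma_on ** 2 - 1.0 / sigma_off ** 2)
        - n_channels * np.log(sigma_on / sigma_off)
    )

    # Add the prior log-odds of the switch being on, then apply a stable sigmoid.
    logit = log_ratio + np.log(prior_on / (1.0 - prior_on))
    return 0.5 * (1.0 + np.tanh(0.5 * logit))


if __name__ == "__main__":
    # Toy feature map with two flat regions; the boundary lies between columns 2 and 3.
    toy = np.zeros((4, 6, 2))
    toy[:, 3:, :] = 5.0
    connected = switch_posterior(toy)
    contour = 1.0 - connected  # high where neighbors are inferred to be disconnected
    print(np.round(contour, 3))
```

Under these assumptions, a contour map is simply one minus the connection probability, so boundaries appear where neighboring features predict each other poorly. In the model described in the abstract, the features and factor parameters are learned with a contrastive objective rather than fixed by hand as they are here.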