Improving saliency models' predictions of the next fixation with humans' intrinsic cost of gaze shifts

Paper title

Improving saliency models' predictions of the next fixation with humans' intrinsic cost of gaze shifts

Authors

Florian Kadner, Tobias Thomas, David Hoppe, Constantin A. Rothkopf

Abstract

The human prioritization of image regions can be modeled in a time-invariant fashion with saliency maps or sequentially with scanpath models. However, while both types of models have steadily improved on several benchmarks and datasets, there is still a considerable gap in predicting human gaze. Here, we leverage two recent developments to reduce this gap: theoretical analyses establishing a principled framework for predicting the next gaze target, and the empirical measurement of the human cost for gaze switches independently of image content. We introduce an algorithm in the framework of sequential decision making, which converts any static saliency map into a sequence of dynamic, history-dependent value maps that are recomputed after each gaze shift. These maps are based on 1) a saliency map provided by an arbitrary saliency model, 2) the recently measured human cost function quantifying preferences in magnitude and direction of eye movements, and 3) a sequential exploration bonus, which changes with each subsequent gaze shift. The parameters of the spatial extent and temporal decay of this exploration bonus are estimated from human gaze data. The relative contributions of these three components were optimized on the MIT1003 dataset for the NSS score and are sufficient to significantly outperform the next-gaze-target predictions of five state-of-the-art saliency models on NSS and AUC scores across three image datasets. Thus, we provide an implementation of human gaze preferences, which can be used to improve arbitrary saliency models' predictions of humans' next gaze targets.
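As a rough illustration of how a static saliency map could be turned into a history-dependent value map, the sketch below combines the three components named in the abstract. It is not the authors' implementation: the function names, the amplitude-only linear cost (the measured human cost also depends on direction), the Gaussian penalty around past fixations standing in for the exploration bonus, and all weights and parameters are assumptions chosen for illustration.

```python
import numpy as np

def gaze_cost(shape, fixation, scale=0.01):
    """Hypothetical cost: grows linearly with saccade amplitude in pixels.
    (Assumption: the empirically measured cost is more complex.)"""
    rows, cols = np.indices(shape)
    return scale * np.hypot(rows - fixation[0], cols - fixation[1])

def exploration_bonus(shape, history, sigma=2.0, decay=0.5):
    """Hypothetical bonus: zero for unvisited regions, negative near past
    fixations. sigma sets the spatial extent; decay shrinks the penalty
    for older fixations."""
    rows, cols = np.indices(shape)
    bonus = np.zeros(shape)
    for age, (r, c) in enumerate(reversed(history)):
        bump = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2 * sigma ** 2))
        bonus -= (decay ** age) * bump
    return bonus

def next_fixation(saliency, history, w_sal=1.0, w_cost=1.0, w_exp=1.0):
    """Recompute the value map from the current fixation history and
    return its maximum as the predicted next gaze target."""
    value = (w_sal * saliency
             - w_cost * gaze_cost(saliency.shape, history[-1])
             + w_exp * exploration_bonus(saliency.shape, history))
    idx = np.unravel_index(np.argmax(value), value.shape)
    return (int(idx[0]), int(idx[1])), value

# Toy saliency map with two peaks; gaze starts on the stronger one.
saliency = np.zeros((9, 9))
saliency[2, 2] = 1.0
saliency[6, 6] = 0.8
history = [(2, 2)]
target, value_map = next_fixation(saliency, history)
print(target)
```

In this toy setting the penalty around the current fixation suppresses the stronger peak, so the predicted target moves to the second peak, even though the underlying saliency map never changes.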
