Paper title
A Spatio-Temporal Attentive Network for Video-Based Crowd Counting
Paper authors
Paper abstract
Automatic people counting from images has recently drawn attention for urban monitoring in modern Smart Cities due to the ubiquity of surveillance camera networks. Current computer vision techniques rely on deep learning-based algorithms that estimate pedestrian densities in individual still images. Only a handful of works take advantage of temporal consistency in video sequences. In this work, we propose a spatio-temporal attentive neural network to estimate the number of pedestrians in surveillance videos. By exploiting the temporal correlation between consecutive frames, we lower the state-of-the-art count error by 5% and the localization error by 7.5% on the widely used FDST benchmark.