Paper title
Visually Impaired Aid using Convolutional Neural Networks, Transfer Learning, and Particle Competition and Cooperation
Authors
Abstract
Navigation and mobility are among the major problems that visually impaired people face in their daily lives. Advances in computer vision have led to the proposal of several navigation systems; however, most of them require expensive and/or heavy hardware. In this paper, we propose the use of convolutional neural networks (CNNs), transfer learning, and semi-supervised learning (SSL) to build a framework for aiding the visually impaired. The framework has a low computational cost and may therefore be implemented on current smartphones without relying on any additional equipment. The smartphone camera can be used to automatically take pictures of the path ahead, which are then immediately classified, providing almost instantaneous feedback to the user. We also propose a dataset to train the classifiers, covering indoor and outdoor situations with different types of lighting, floors, and obstacles. Many different CNN architectures are evaluated as feature extractors and classifiers by fine-tuning weights pre-trained on a much larger dataset. A graph-based SSL method, known as particle competition and cooperation, is also used for classification, allowing feedback from the user to be incorporated without retraining the underlying network. Classification accuracies of 92\% and 80\% are achieved on the proposed dataset in the best supervised and SSL scenarios, respectively.
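The idea of incorporating user feedback without retraining the underlying network can be illustrated with a minimal graph-based SSL sketch. It is not the particle competition and cooperation method named in the abstract. As a stand-in, it uses plain label propagation on a k-nearest-neighbor graph. The feature vectors are assumed to come from a frozen, pre-trained CNN, and the function names, toy data, and parameters `k` and `n_iter` are all hypothetical.

```python
import math


def knn_graph(features, k):
    """Build a symmetric k-nearest-neighbor graph (Euclidean distance)."""
    adj = [set() for _ in features]
    for i, fi in enumerate(features):
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        for _, j in dists[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrize the edge
    return adj


def propagate_labels(features, labels, n_classes, k=2, n_iter=50):
    """Plain label propagation (a stand-in, not particle competition).

    labels[i] is a class index, or None when the sample is unlabeled.
    Labeled samples stay clamped; unlabeled ones repeatedly take the
    average class scores of their graph neighbors.
    """
    adj = knn_graph(features, k)
    scores = []
    for lab in labels:
        if lab is None:
            scores.append([1.0 / n_classes] * n_classes)
        else:
            scores.append([1.0 if c == lab else 0.0 for c in range(n_classes)])

    for _ in range(n_iter):
        updated = []
        for i, row in enumerate(scores):
            if labels[i] is not None:
                updated.append(row)  # keep known labels fixed
                continue
            nbrs = adj[i]
            updated.append(
                [sum(scores[j][c] for j in nbrs) / len(nbrs)
                 for c in range(n_classes)]
            )
        scores = updated

    return [max(range(n_classes), key=lambda c: row[c]) for row in scores]


# Toy 2-D "features" standing in for CNN embeddings of path images.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
# Class 0 and class 1 are hypothetical (e.g. "clear path" and "obstacle").
labels = [0, None, None, 1, None, None]
predicted = propagate_labels(features, labels, n_classes=2)
print(predicted)
```

In this sketch, user feedback amounts to filling in one more entry of `labels` and running the propagation again. The feature extractor itself is left untouched.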