User Identification: A Key Enabler for Multi-User Vision-Aided Communications

Paper Title

User Identification: A Key Enabler for Multi-User Vision-Aided Communications

Authors

Gouranga Charan, Ahmed Alkhateeb

Abstract

Vision-aided wireless communication is attracting increasing interest and finding new use cases in various wireless communication applications. These vision-aided communication frameworks leverage visual data captured, for example, by cameras installed at the infrastructure or on mobile devices to build a perception of the communication environment using deep learning and advances in computer vision and visual scene understanding. Prior work has investigated various problems such as vision-aided beam, blockage, and hand-off prediction in millimeter wave (mmWave) systems and vision-aided covariance prediction in massive MIMO systems. This prior work, however, has focused on scenarios with a single object (user) in front of the camera. In this paper, we define the user identification task as a key enabler for realistic vision-aided communication systems that can operate in crowded scenarios and support multi-user applications. The objective of the user identification task is to identify the target communication user among the other candidate objects (distractors) in the visual scene. We develop machine learning models that process either one frame or a sequence of frames of visual and wireless data to efficiently identify the target user in the visual/communication environment. Using the large-scale multi-modal sensing and communication dataset, DeepSense 6G, which is based on real-world measurements, we show that the developed approaches can successfully identify the target users with more than 97% accuracy in realistic settings. This paves the way for scaling vision-aided wireless communication applications to real-world scenarios and practical deployments.
