Paper Title

Smart speaker design and implementation with biometric authentication and advanced voice interaction capability

Authors

Bharath Sudharsan, Peter Corcoran, Muhammad Intizar Ali

Abstract

Advancements in semiconductor technology have reduced the dimensions and cost of chipsets while improving their performance and capacity. In addition, advances in AI frameworks and libraries make it possible to accommodate more AI at the resource-constrained edge of consumer IoT devices. Sensors are now an integral part of our environment and provide continuous data streams for building intelligent applications. One example is a smart home scenario with multiple interconnected devices. In such smart environments, for convenience and quick access to web-based services and personal information such as calendars, notes, emails, reminders, and banking, users link third-party skills or skills from the Amazon store to their smart speakers. Also, in current smart home scenarios, several smart home products such as smart security cameras, video doorbells, smart plugs, smart carbon monoxide monitors, and smart door locks are interlinked to a modern smart speaker by adding custom skills. Because smart speakers are linked to such services and devices through the smart speaker user's account, anyone with physical access to the smart speaker can use them via voice commands. When that happens, the user's data privacy, home security, and other interests are compromised. The recently launched Tensor Cam's AI Camera, Toshiba's Symbio, and Facebook's Portal are camera-enabled smart speakers with AI functionalities. Although they are camera-enabled, they do not have an authentication scheme beyond calling out the wake-word. This paper provides an overview of the cybersecurity risks faced by smart speaker users due to the lack of an authentication scheme and discusses the development of a state-of-the-art camera-enabled, microphone array-based modern Alexa smart speaker prototype to address these risks.
