Paper Title
The MIT Voice Name System
Paper Authors
Paper Abstract
This RFC white paper summarizes our progress on the MIT Voice Name System (VNS) and Huey. The VNS, similar in name and function to the DNS, is a system to reserve and use "wake words" to activate Artificial Intelligence (AI) devices. Just as you can say "Hey Siri" to activate Apple's personal assistant, we propose using the VNS in smart speakers and other devices to route wake requests based on commands such as "turn off", "open grocery shopping list", or "271, start flash card review of my computer vision class". We also introduce Huey, an unambiguous natural language for interacting with AI devices. We aim to standardize voice interactions to a universal reach similar to that of other systems, such as phone numbering, with its agreed worldwide approach to assigning and using numbers, or the Internet's DNS, whose standard naming system has helped popular services flourish, including the World Wide Web, FTP, and email. Just as these standards are "neutral", we also aim to endow the VNS with "wake neutrality" so that each participant can develop its own digital voice. We focus on voice as a starting point for talking to any IoT object and briefly explain how the VNS may be expanded to other AI technologies that enable person-to-machine conversations (really machine-to-machine), including computer vision or neural interfaces. We also briefly describe considerations for a broader set of standards, MIT Open AI (MOA), including a reference architecture to serve as a starting point for the development of a general conversational commerce infrastructure that has standard "Wake Words", NLP commands such as "Shopping Lists" or "Flash Card Reviews", and personalities such as Pi or 271. Privacy and security are key elements considered because of speech-to-text errors and the amount of personal information contained in a voice sample.
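The DNS analogy in the abstract can be made concrete with a small sketch. The abstract does not specify how the VNS stores or resolves names, so everything below is an assumption for illustration: the class `WakeWordRegistry`, its `reserve` and `route` methods, the handler strings, and the longest-prefix matching rule are all hypothetical. The sketch only shows the general idea of reserving a wake word once and then routing an utterance to whoever holds it.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class WakeWordRegistry:
    """Hypothetical DNS-like table mapping reserved wake words to handlers."""

    _records: dict = field(default_factory=dict)

    @staticmethod
    def _normalize(phrase: str) -> str:
        # Lowercase, replace punctuation with spaces, collapse whitespace,
        # so "271, start" and "271 start" compare equal.
        cleaned = re.sub(r"[^\w\s]", " ", phrase.lower())
        return " ".join(cleaned.split())

    def reserve(self, wake_word: str, handler: str) -> bool:
        """Reserve a wake word; return False if empty or already taken."""
        key = self._normalize(wake_word)
        if not key or key in self._records:
            return False
        self._records[key] = handler
        return True

    def route(self, utterance: str) -> Optional[Tuple[str, str]]:
        """Return (handler, remaining command) for the longest reserved
        wake word that starts the utterance, or None if nothing matches."""
        words = self._normalize(utterance).split()
        for n in range(len(words), 0, -1):
            key = " ".join(words[:n])
            if key in self._records:
                return self._records[key], " ".join(words[n:])
        return None


# Example with made-up handler names (assumptions, not real services).
registry = WakeWordRegistry()
registry.reserve("271", "flashcards.example")
registry.reserve("Hey Siri", "assistant.example")
result = registry.route("271, start flash card review of my computer vision class")
```

Refusing a second reservation of the same wake word mirrors how a domain name has a single holder. This is one plausible reading of "reserve", not a rule stated in the abstract.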