Paper Title

Deep Learning with Coherent VCSEL Neural Networks

Authors

Zaijun Chen, Alexander Sludds, Ronald Davis, Ian Christen, Liane Bernstein, Tobias Heuser, Niels Heermeier, James A. Lott, Stephan Reitzenstein, Ryan Hamerly, Dirk Englund

Abstract

Deep neural networks (DNNs) are reshaping the field of information processing. With their exponential growth challenging existing electronic hardware, optical neural networks (ONNs) are emerging to process DNN tasks in the optical domain with high clock rates, parallelism, and low-loss data transmission. However, to explore the potential of ONNs, it is necessary to investigate the full-system performance incorporating the major DNN elements, including matrix algebra and nonlinear activation. Existing challenges for ONNs are high energy consumption due to low electro-optic (EO) conversion efficiency, low compute density due to large device footprint and channel crosstalk, and long latency due to the lack of inline nonlinearity. Here we experimentally demonstrate an ONN system that simultaneously overcomes all these challenges. We exploit neuron encoding with volume-manufactured micron-scale vertical-cavity surface-emitting laser (VCSEL) transmitter arrays that exhibit high EO conversion (<5 attojoule/symbol with $V_\pi$ = 4 mV), high operation bandwidth (up to 25 GS/s), and compact footprint (<0.01 mm$^2$ per device). Photoelectric multiplication allows low-energy matrix operations at the shot-noise quantum limit. Homodyne detection-based nonlinearity enables nonlinear activation with instantaneous response. The full-system energy efficiency and compute density reach 7 femtojoules per operation (fJ/OP) and 25 TeraOP/(mm$^2\cdot$s), both representing a >100-fold improvement over state-of-the-art digital computers, with room for several more orders of magnitude of future improvement. Beyond neural network inference, the system's rapid weight updating is crucial for training deep learning models. Our technique opens an avenue to large-scale optoelectronic processors to accelerate machine learning tasks from data centers to decentralized edge devices.
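
The photoelectric multiplication mentioned in the abstract can be illustrated with a textbook model of balanced homodyne detection. The sketch below is not the authors' implementation: it assumes an ideal lossless 50:50 beamsplitter, noiseless detectors, and real-valued amplitude encoding, and the function names `balanced_homodyne` and `optical_dot` are hypothetical. It shows only that the difference photocurrent is proportional to the product of the two input fields, so summing it over time steps gives a dot product.

```python
import numpy as np


def balanced_homodyne(e_signal, e_lo):
    """Difference photocurrent of an idealized balanced homodyne detector.

    Inputs are complex field amplitudes in arbitrary units (assumption:
    ideal 50:50 beamsplitter, unit detector responsivity, no noise).
    """
    # The two beamsplitter outputs carry the sum and difference of the fields.
    out_plus = (e_signal + e_lo) / np.sqrt(2)
    out_minus = (e_signal - e_lo) / np.sqrt(2)
    # Each photodetector measures intensity; subtracting the two cancels
    # the |E|^2 terms and leaves the cross term 2 * Re(e_signal * conj(e_lo)).
    return np.abs(out_plus) ** 2 - np.abs(out_minus) ** 2


def optical_dot(x, w):
    """Dot product built by accumulating per-time-step homodyne products."""
    x = np.asarray(x, dtype=complex)
    w = np.asarray(w, dtype=complex)
    # The factor 0.5 undoes the 2 in the cross term.
    return 0.5 * np.sum(balanced_homodyne(x, w))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]   # hypothetical input activations
    w = [4.0, -5.0, 6.0]  # hypothetical weights
    print(optical_dot(x, w), np.dot(x, w))
```

In this idealized model the multiplication happens in the detection step itself, which fits the abstract's description of low-energy matrix operations. A real system would also have to account for shot noise, phase stability, and the actual encoding scheme, none of which are modeled here.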
