Paper Title
Knowledge Distillation in Quantum Neural Network using Approximate Synthesis
Paper Authors
Paper Abstract
Recent assertions of a potential advantage of Quantum Neural Networks (QNNs) for specific Machine Learning (ML) tasks have sparked the curiosity of a sizable number of application researchers. The parameterized quantum circuit (PQC), a major building block of a QNN, consists of several layers of single-qubit rotations and multi-qubit entanglement operations. The optimum number of PQC layers for a particular ML task is generally unknown. A larger network often provides better performance in noiseless simulations; however, it may perform poorly on hardware compared to a shallower network. Because the amount of noise varies amongst quantum devices, the optimal depth of a PQC can vary significantly. Additionally, the gates chosen for the PQC may be suitable for one type of hardware but not for another due to compilation overhead. This makes it difficult to generalize a QNN design to a wide range of hardware and noise levels. An alternative approach is to build and train multiple QNN models, one targeted at each hardware platform, which can be expensive. To circumvent these issues, we introduce the concept of knowledge distillation in QNNs using approximate synthesis. The proposed approach creates a new QNN with (i) a reduced number of layers or (ii) a different gate set, without having to train it from scratch. Training the new network for a few epochs can compensate for the loss caused by approximation error. Through empirical analysis, we demonstrate a ~71.4% reduction in circuit layers while still achieving ~16.2% better accuracy under noise.
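To make the idea concrete, below is a minimal, hypothetical sketch of the distillation step, not the authors' implementation. It fits a shallower "student" PQC to the unitary of a trained deeper "teacher" PQC by minimizing a Hilbert-Schmidt-style distance, which is one common way to pose approximate synthesis. The PennyLane/SciPy stack, the `StronglyEntanglingLayers` ansatz, the layer counts (7 → 2, which happens to match the ~71.4% layer reduction quoted in the abstract), and names like `synthesis_cost` are all illustrative assumptions.

```python
# Hypothetical sketch (not the paper's code): distill a trained deep PQC into a
# shallower one by fitting the shallow circuit's unitary to the deep one,
# in the spirit of approximate synthesis. Assumes PennyLane and SciPy.
import pennylane as qml
from pennylane import numpy as np
from scipy.optimize import minimize

n_qubits = 2
wires = list(range(n_qubits))

def ansatz(weights):
    # One PQC "layer" = single-qubit rotations + multi-qubit entanglement,
    # the building block described in the abstract.
    qml.StronglyEntanglingLayers(weights, wires=wires)

# Stand-in for a trained 7-layer "teacher" PQC; in practice these weights
# would come from training the original QNN on the ML task.
teacher_weights = np.random.uniform(0, 2 * np.pi, size=(7, n_qubits, 3))
U_teacher = qml.matrix(ansatz, wire_order=wires)(teacher_weights)

# Shallower "student": 2 layers instead of 7 (~71.4% fewer layers).
student_shape = (2, n_qubits, 3)

def synthesis_cost(flat_weights):
    # Hilbert-Schmidt-style distance: reaches 0 when the student's unitary
    # matches the teacher's up to a global phase.
    V = qml.matrix(ansatz, wire_order=wires)(flat_weights.reshape(student_shape))
    d = 2 ** n_qubits
    return 1.0 - np.abs(np.trace(U_teacher.conj().T @ V)) / d

x0 = np.random.uniform(0, 2 * np.pi, size=student_shape).ravel()
res = minimize(synthesis_cost, x0, method="COBYLA", options={"maxiter": 2000})
print(f"approximation error: {res.fun:.4f}")
# Per the abstract, the student would then be fine-tuned for a few epochs on
# the ML task to compensate for the remaining approximation error.
```

Note that case (ii) of the abstract, retargeting to a different gate set, fits the same template: one would simply define the student `ansatz` over the native gates of the target hardware and run the same fit.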