Paper Title
Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm
Paper Authors
Paper Abstract
Drawing an analogy with automatic image completion systems, we propose Music SketchNet, a neural network framework that allows users to specify partial musical ideas guiding automatic music generation. We focus on generating the missing measures in incomplete monophonic musical pieces, conditioned on surrounding context, and optionally guided by user-specified pitch and rhythm snippets. First, we introduce SketchVAE, a novel variational autoencoder that explicitly factorizes rhythm and pitch contour to form the basis of our proposed model. Then we introduce two discriminative architectures, SketchInpainter and SketchConnector, that in conjunction perform the guided music completion, filling in representations for the missing measures conditioned on surrounding context and user-specified snippets. We evaluate SketchNet on a standard dataset of Irish folk music and compare with models from recent works. When used for music completion, our approach outperforms the state-of-the-art both in terms of objective metrics and subjective listening tests. Finally, we demonstrate that our model can successfully incorporate user-specified snippets during the generation process.
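To make the factorization idea concrete, the following is a minimal sketch (not the paper's implementation) of splitting a monophonic measure into the two streams SketchVAE encodes separately: a pitch contour and a rhythm pattern. The note encoding, token names, and function name are illustrative assumptions.

```python
# Hedged sketch: factorize a monophonic measure into separate pitch and
# rhythm streams, mirroring the pitch/rhythm decomposition described in
# the abstract. A note is (midi_pitch, duration_in_16th_notes); the
# "onset"/"hold" token scheme is an assumption for illustration.

def factorize_measure(notes):
    """Split a monophonic measure into a pitch stream and a rhythm stream."""
    # Pitch stream: the contour only, with all timing removed.
    pitch_stream = [pitch for pitch, _ in notes]
    # Rhythm stream: one token per 16th-note step, timing only.
    rhythm_stream = []
    for _, duration in notes:
        rhythm_stream.append("onset")                 # note attack
        rhythm_stream.extend(["hold"] * (duration - 1))  # sustained steps
    return pitch_stream, rhythm_stream

# Example measure: C4 quarter note, D4 eighth, E4 eighth (in 16th steps).
measure = [(60, 4), (62, 2), (64, 2)]
pitches, rhythm = factorize_measure(measure)
# pitches → [60, 62, 64]
# rhythm  → ["onset", "hold", "hold", "hold", "onset", "hold", "onset", "hold"]
```

Because the two streams are independent after this split, a user-specified pitch snippet can constrain one stream while the rhythm stream is generated freely, and vice versa.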