Paper Title
PopMAG: Pop Music Accompaniment Generation
Paper Authors
Abstract
In pop music, accompaniments are usually played by multiple instruments (tracks) such as drum, bass, string and guitar, and can make a song more expressive and contagious when arranged together with its melody. Previous works usually generate the tracks separately, so that the music notes from different tracks do not explicitly depend on each other, which hurts harmony modeling. To improve harmony, in this paper we propose a novel MUlti-track MIDI representation (MuMIDI), which enables simultaneous multi-track generation in a single sequence and explicitly models the dependency of notes from different tracks. While this greatly improves harmony, it unfortunately enlarges the sequence length and brings a new challenge for long-term music modeling. We further introduce two new techniques to address this challenge: 1) we model the multiple attributes (e.g., pitch, duration, velocity) of a musical note in one step instead of multiple steps, which shortens the length of a MuMIDI sequence; 2) we introduce extra long-range context as memory to capture long-term dependencies in music. We call our system for pop music accompaniment generation PopMAG. We evaluate PopMAG on multiple datasets (LMD, FreeMidi and CPMD, a private dataset of Chinese pop songs) with both subjective and objective metrics. The results demonstrate the effectiveness of PopMAG for multi-track harmony modeling and long-term context modeling. Specifically, PopMAG wins 42\%/38\%/40\% of votes when compared against ground-truth musical pieces on the LMD, FreeMidi and CPMD datasets respectively, and largely outperforms other state-of-the-art music accompaniment generation models and multi-track MIDI representations in terms of both subjective and objective metrics.
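To illustrate the first technique (one-step attribute modeling), the following is a minimal sketch, not the paper's actual implementation: it contrasts a tokenization that emits one sequence step per note attribute with one that groups all attributes of a note into a single step, showing how grouping shortens the sequence. All function names and the token format are illustrative assumptions.

```python
# Hypothetical sketch: grouping note attributes into one sequence step,
# in the spirit of MuMIDI's one-step attribute modeling. Token formats
# and function names are illustrative, not the paper's actual design.

def notes_to_steps_separate(notes):
    """One step per attribute: sequence length = 3 * number of notes."""
    tokens = []
    for pitch, duration, velocity in notes:
        tokens.append(("pitch", pitch))
        tokens.append(("duration", duration))
        tokens.append(("velocity", velocity))
    return tokens

def notes_to_steps_grouped(notes):
    """All attributes of a note in one step: sequence length = number of notes."""
    return [("note", p, d, v) for p, d, v in notes]

notes = [(60, 4, 90), (64, 2, 80), (67, 2, 85)]  # (pitch, duration, velocity)
print(len(notes_to_steps_separate(notes)))  # 9 steps
print(len(notes_to_steps_grouped(notes)))   # 3 steps, a 3x shorter sequence
```

With attributes grouped, a Transformer-style model attends over one third as many positions for the same music, which is exactly the kind of length reduction the abstract describes as easing long-term modeling.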