Paper Title

Improved singing voice separation with chromagram-based pitch-aware remixing

Authors

Siyuan Yuan, Zhepei Wang, Umut Isik, Ritwik Giri, Jean-Marc Valin, Michael M. Goodwin, Arvindh Krishnaswamy

Abstract

Singing voice separation aims to separate music into vocal and accompaniment components. One of the major constraints for the task is the limited amount of training data with separated vocals. Data augmentation techniques such as random source mixing have been shown to make better use of existing data and mildly improve model performance. We propose a novel data augmentation technique, chromagram-based pitch-aware remixing, where music segments with high pitch alignment are mixed. By performing controlled experiments in both supervised and semi-supervised settings, we demonstrate that training models with pitch-aware remixing significantly improves the test signal-to-distortion ratio (SDR).
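The abstract does not say how pitch alignment is scored, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a 12-bin pitch-class (chroma) profile per segment and cosine similarity between profiles as the alignment score. All function names, the octave range, and the A4 = 440 Hz tuning are hypothetical choices made for illustration.

```python
import math

A4_HZ = 440.0  # assumed reference tuning


def chroma_vector(signal, sample_rate, octaves=(3, 4, 5)):
    """Return a 12-bin pitch-class energy profile (index 0 = C, 9 = A).

    Energy is measured by projecting the signal onto sinusoids at
    equal-tempered note frequencies and folding octaves together.
    """
    chroma = [0.0] * 12
    n = len(signal)
    for pitch_class in range(12):
        for octave in octaves:
            midi = 12 * (octave + 1) + pitch_class
            freq = A4_HZ * 2.0 ** ((midi - 69) / 12.0)
            w = 2.0 * math.pi * freq / sample_rate
            re = sum(x * math.cos(w * t) for t, x in enumerate(signal))
            im = sum(x * math.sin(w * t) for t, x in enumerate(signal))
            chroma[pitch_class] += (re * re + im * im) / (n * n)
    return chroma


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def pick_aligned_accompaniment(vocal, accompaniments, sample_rate):
    """Index of the accompaniment whose chroma best matches the vocal's."""
    vocal_chroma = chroma_vector(vocal, sample_rate)
    scores = [
        cosine_similarity(vocal_chroma, chroma_vector(acc, sample_rate))
        for acc in accompaniments
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def remix(vocal, accompaniment):
    """Sample-wise sum: the synthetic mixture used as a training input."""
    return [v + a for v, a in zip(vocal, accompaniment)]


def sine(freq, sample_rate, num_samples):
    """Pure test tone, used here only to exercise the functions above."""
    return [math.sin(2.0 * math.pi * freq * t / sample_rate)
            for t in range(num_samples)]
```

In use, a vocal segment would be paired with the accompaniment returned by `pick_aligned_accompaniment`, and `remix(vocal, accompaniment)` together with the clean vocal would form one augmented training example. A practical pipeline would more likely compute frame-wise chromagrams with an audio library rather than this whole-segment projection.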
