Paper title
Feature-informed Latent Space Regularization for Music Source Separation
Paper authors
Paper abstract
The integration of additional side information to improve music source separation has been investigated numerous times, e.g., by adding features to the input or by adding learning targets in a multi-task learning scenario. These approaches, however, require additional annotations such as musical scores, instrument labels, etc. in training and possibly during inference. The available datasets for source separation do not usually provide these additional annotations. In this work, we explore transfer learning strategies to incorporate VGGish features with a state-of-the-art source separation model; VGGish features are known to be a very condensed representation of audio content and have been successfully used in many MIR tasks. We introduce three approaches to incorporate the features, including two latent space regularization methods and one naive concatenation method. Experimental results show that our proposed approaches improve several evaluation metrics for music source separation.
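To make the distinction between latent space regularization and naive concatenation concrete, the following is a minimal NumPy sketch. The abstract does not specify the regularizers, the model, or any hyperparameters, so everything here is an assumption for illustration: the function names, the choice of mean squared error and cosine distance as regularizers, the weight of 0.1, and the random arrays standing in for real VGGish embeddings (which are 128-dimensional) and for the separation model's latent code.

```python
import numpy as np


def mse_regularizer(latent, target):
    """Hypothetical regularizer: mean squared distance between the latent code and the features."""
    return float(np.mean((latent - target) ** 2))


def cosine_regularizer(latent, target, eps=1e-8):
    """Hypothetical regularizer: mean cosine distance (1 - cosine similarity) over the batch."""
    num = np.sum(latent * target, axis=-1)
    den = np.linalg.norm(latent, axis=-1) * np.linalg.norm(target, axis=-1) + eps
    return float(np.mean(1.0 - num / den))


def regularized_loss(separation_loss, latent, target, weight=0.1,
                     regularizer=mse_regularizer):
    """Add a weighted latent-space penalty to a separation loss (weight is an assumed value)."""
    return separation_loss + weight * regularizer(latent, target)


def concatenate_features(latent, features):
    """Naive alternative: append the features to the latent code along the last axis."""
    return np.concatenate([latent, features], axis=-1)


# Random stand-ins; a real setup would use the model's bottleneck output
# and embeddings from a pretrained VGGish network.
rng = np.random.default_rng(0)
latent = rng.normal(size=(4, 128))
vggish_like = rng.normal(size=(4, 128))

loss_mse = regularized_loss(1.0, latent, vggish_like)
loss_cos = regularized_loss(1.0, latent, vggish_like, regularizer=cosine_regularizer)
combined = concatenate_features(latent, vggish_like)
```

In this sketch the regularized variants need the pretrained features only to compute the training loss, whereas the concatenation variant feeds them into the model and therefore also needs them at inference time.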