A Convolutional-Attentional Neural Framework for Structure-Aware Performance-Score Synchronization

Paper title

A Convolutional-Attentional Neural Framework for Structure-Aware Performance-Score Synchronization

Authors

Ruchit Agrawal, Daniel Wolff, Simon Dixon

Abstract

Performance-score synchronization is an integral task in signal processing, which entails generating an accurate mapping between an audio recording of a performance and the corresponding musical score. Traditional synchronization methods compute alignment using knowledge-driven and stochastic approaches, and are typically unable to generalize well to different domains and modalities. We present a novel data-driven method for structure-aware performance-score synchronization. We propose a convolutional-attentional architecture trained with a custom loss based on time-series divergence. We conduct experiments on the audio-to-MIDI and audio-to-image alignment tasks, which correspond to different score modalities. We validate the effectiveness of our method via ablation studies and comparisons with state-of-the-art alignment approaches. We demonstrate that our approach outperforms previous synchronization methods for a variety of test settings across score modalities and acoustic conditions. Our method is also robust to structural differences between the performance and score sequences, which standard alignment approaches commonly fail to handle.
