Paper Title

High-resolution Piano Transcription with Pedals by Regressing Onset and Offset Times

Authors

Qiuqiang Kong, Bochen Li, Xuchen Song, Yuan Wan, Yuxuan Wang

Abstract

Automatic music transcription (AMT) is the task of transcribing audio recordings into symbolic representations. Recently, neural network-based methods have been applied to AMT and have achieved state-of-the-art results. However, many previous systems detect note onsets and offsets only frame-wise, so their transcription resolution is limited to the frame hop size. There is little research on strategies for encoding onset and offset targets for training. In addition, previous AMT systems are sensitive to misaligned onset and offset labels in audio recordings. Furthermore, there is limited research on sustain pedal transcription on large-scale datasets. In this article, we propose a high-resolution AMT system trained by regressing the precise onset and offset times of piano notes. At inference, we propose an algorithm to analytically calculate the precise onset and offset times of piano notes and pedal events. We show that our AMT system is more robust to misaligned onset and offset labels than previous systems. Our proposed system achieves an onset F1 of 96.72% on the MAESTRO dataset, outperforming the previous Onsets and Frames system, which achieves 94.80%. Our system achieves a pedal onset F1 score of 91.86%, which is the first benchmark result on the MAESTRO dataset. We have released the source code and checkpoints of our work at https://github.com/bytedance/piano_transcription.
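To make the resolution argument concrete, the following is a minimal sketch of how a sub-frame onset time could be recovered from frame-wise regression outputs. It is not taken from the released code. It assumes a symmetric triangular target that decays linearly over a few frames on each side of the true onset, and a 10 ms hop; the target shape, the hop, and the helper names `triangular_targets` and `refine_peak` are all illustrative assumptions.

```python
def triangular_targets(onset_seconds, num_frames, hop_seconds, half_width_frames=5):
    """Hypothetical regression target: 1 at the exact onset, decaying
    linearly to 0 over `half_width_frames` frames on each side."""
    onset_in_frames = onset_seconds / hop_seconds
    return [
        max(0.0, 1.0 - abs(i - onset_in_frames) / half_width_frames)
        for i in range(num_frames)
    ]


def refine_peak(outputs, hop_seconds):
    """Return (frame_quantized_time, refined_time) for the strongest peak.

    For a symmetric triangle sampled at three neighbouring frames, the
    peak offset from the centre frame is
        (right - left) / (2 * (mid - min(left, right))).
    """
    # Index of the largest output, excluding the two boundary frames.
    n = max(range(1, len(outputs) - 1), key=lambda i: outputs[i])
    left, mid, right = outputs[n - 1], outputs[n], outputs[n + 1]
    lower = min(left, right)
    # Flat top: no sub-frame information, keep the frame centre.
    shift = 0.0 if mid == lower else (right - left) / (2.0 * (mid - lower))
    return n * hop_seconds, (n + shift) * hop_seconds


hop = 0.01            # assumed 10 ms hop
true_onset = 1.2345   # seconds, deliberately between two frames
outputs = triangular_targets(true_onset, num_frames=200, hop_seconds=hop)
coarse, refined = refine_peak(outputs, hop)
print(f"frame-wise: {coarse:.4f} s, refined: {refined:.4f} s")
```

With ideal (noise-free) outputs, the frame-wise estimate is off by up to half a hop, while the three-point refinement recovers the onset exactly. A trained network's outputs would be noisy, so the refined estimate would only approximate the true time.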
