Paper title
Playing Technique Detection by Fusing Note Onset Information in Guzheng Performance
Paper authors
Paper abstract
The Guzheng is a traditional Chinese instrument with diverse playing techniques. Instrument playing techniques (IPTs) play an important role in musical performance. However, most existing IPT detection methods are inefficient on variable-length audio and offer no assurance of generalization, because they rely on a single sound bank for both training and testing. In this study, we propose an end-to-end Guzheng playing technique detection system based on Fully Convolutional Networks that can be applied to variable-length audio. Because each Guzheng playing technique is applied to a note, a dedicated onset detector is trained to segment the audio into notes, and its predictions are fused with the frame-wise IPT predictions. During fusion, we sum the frame-wise IPT predictions within each note and take the IPT with the highest summed probability as the final output for that note. We create a new dataset named GZ_IsoTech from multiple sound banks and real-world recordings for Guzheng performance analysis. Our approach achieves 87.97% frame-level accuracy and 80.76% note-level F1-score, outperforming existing works by a large margin, which indicates the effectiveness of the proposed method for IPT detection.
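
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_onsets_with_frames`, the array layout (frames by technique classes), and the choice to ignore frames before the first detected onset are all assumptions made here for illustration. The sketch assumes that onsets are already available as frame indices and that each note runs from its onset to the next onset (or to the end of the audio).

```python
import numpy as np

def fuse_onsets_with_frames(frame_probs, onset_frames):
    """Fuse frame-wise IPT probabilities with note onsets (illustrative sketch).

    frame_probs:  array of shape (n_frames, n_techniques), one row per frame.
    onset_frames: frame indices at which a note onset was detected.
    Returns a list of (start_frame, end_frame, technique_index), end exclusive.
    """
    frame_probs = np.asarray(frame_probs, dtype=float)
    n_frames = frame_probs.shape[0]

    # Keep valid, unique onsets in ascending order.
    starts = sorted({int(f) for f in onset_frames if 0 <= int(f) < n_frames})

    notes = []
    for i, start in enumerate(starts):
        # Assumption: a note lasts until the next onset or the end of the audio.
        end = starts[i + 1] if i + 1 < len(starts) else n_frames
        # Sum the frame-wise predictions inside the note ...
        summed = frame_probs[start:end].sum(axis=0)
        # ... and keep the technique with the highest summed probability.
        notes.append((start, end, int(summed.argmax())))
    return notes

# Toy input: 5 frames, 3 hypothetical technique classes, onsets at frames 0 and 3.
frame_probs = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
])
notes = fuse_onsets_with_frames(frame_probs, onset_frames=[0, 3])
print(notes)
```

In this toy input the first frame alone would favour class 0, but the summed evidence over frames 0 to 2 favours class 1, so the whole first note is labelled with class 1. This is the sense in which note-level fusion can smooth out isolated frame-level errors.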