Audio Spoofing Verification using Deep Convolutional Neural Networks by Transfer Learning

Paper Title

Audio Spoofing Verification using Deep Convolutional Neural Networks by Transfer Learning

Authors

P, Rahul T; Aravind, P R; C, Ranjith; Nechiyil, Usamath; Paramparambath, Nandakumar

Abstract

Automatic speaker verification (ASV) systems are gaining popularity, and spoofing attacks are a prime concern because they make these systems vulnerable. Some spoofing attacks, such as replay attacks, are easy to implement but very hard to detect, which creates the need for suitable countermeasures. In this paper, we propose a speech classifier based on a deep convolutional neural network to detect spoofing attacks. Our proposed methodology uses an acoustic time-frequency representation of power spectral densities on the Mel frequency scale (Mel-spectrogram), via deep residual learning (an adaptation of the ResNet-34 architecture). Using a single-model system, we have achieved an equal error rate (EER) of 0.9056% on the development dataset and 5.32% on the evaluation dataset of the logical access scenario, and an EER of 5.87% on the development dataset and 5.74% on the evaluation dataset of the physical access scenario of ASVspoof 2019.
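
The abstract reports its results as equal error rates. As a reading aid, the following is a minimal sketch of how an EER can be estimated from countermeasure scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores mean "bona fide", and the toy scores are all assumptions made for illustration. The EER is the operating point where the false acceptance rate (spoofed audio accepted) equals the false rejection rate (bona fide audio rejected).

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER by sweeping a decision threshold over all scores.

    Assumed convention (hypothetical): a trial is accepted as bona fide
    when its score is >= the threshold.
    """
    # Candidate thresholds: every observed score, plus one above the
    # maximum so that "reject everything" is also considered.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed trials scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest
            # and report their average as the EER estimate.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, invented for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.3]
spoof = [0.1, 0.2, 0.4, 0.6]
print(f"EER = {equal_error_rate(bonafide, spoof):.2%}")
```

On finite score sets the two error rates may never be exactly equal, so this sketch averages them at the closest threshold; evaluation toolkits often interpolate instead, which can give slightly different values.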
