Paper title
X-vectors: New Quantitative Biomarkers for Early Parkinson's Disease Detection from Speech
Paper authors
Abstract
Many articles have used voice analysis to detect Parkinson's disease (PD), but few have focused on the early stages of the disease or on the effect of gender. In this article, we adapted the latest speaker recognition system, called x-vectors, to detect early-stage PD from voice analysis. X-vectors are embeddings extracted from a deep neural network; they provide robust speaker representations and improve speaker recognition when large amounts of training data are available. Our goal was to assess whether, in the context of early PD detection, this technique would outperform the more standard MFCC-GMM classifier (Mel-Frequency Cepstral Coefficients - Gaussian Mixture Model) and, if so, under which conditions. We recorded 221 French speakers (including recently diagnosed PD subjects and healthy controls) with a high-quality microphone and with their own telephones. Men and women were analyzed separately in order to obtain more precise models and to assess a possible gender effect. Several experimental and methodological aspects were tested to analyze their impact on classification performance: audio segment duration, data augmentation, the type of dataset used to train the neural network, the type of speech task, and the back-end analysis. The x-vector technique provided better classification performance than MFCC-GMM for text-independent tasks, and seemed particularly well suited to the early detection of PD in women (7% to 15% improvement). This result was observed for both recording types (high-quality microphone and telephone).
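To make the MFCC-GMM baseline concrete, here is a minimal sketch of one common way such a classifier scores a recording: each class (PD, control) is represented by a diagonal-covariance Gaussian mixture over feature frames, and the decision uses the average log-likelihood ratio between the two models. The abstract does not say how the decision is made, so this scoring rule is an assumption. The function names, the `(weight, mean, variance)` component layout, and the toy 2-D parameters are all hypothetical. A real system would use MFCC frames extracted from audio and mixtures fitted by EM, neither of which is shown here.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of feature frames under a GMM.

    `gmm` is a list of (weight, mean, var) components; this layout is a
    hypothetical choice made for the sketch.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)


def llr_score(frames, gmm_pd, gmm_control):
    """Log-likelihood ratio: positive values favor the PD model."""
    return (gmm_avg_log_likelihood(frames, gmm_pd)
            - gmm_avg_log_likelihood(frames, gmm_control))


# Toy 2-D stand-ins for MFCC frames; all parameter values are made up.
GMM_PD = [(0.5, [1.0, 1.0], [1.0, 1.0]), (0.5, [2.0, 0.0], [1.0, 1.0])]
GMM_CONTROL = [(0.5, [-1.0, -1.0], [1.0, 1.0]), (0.5, [-2.0, 0.0], [1.0, 1.0])]

frames_a = [[1.2, 0.8], [1.9, 0.1], [0.7, 1.1]]   # lie near the PD components
frames_b = [[-1.1, -0.9], [-2.2, 0.3]]            # lie near the control components

if __name__ == "__main__":
    print("score A:", llr_score(frames_a, GMM_PD, GMM_CONTROL))
    print("score B:", llr_score(frames_b, GMM_PD, GMM_CONTROL))
```

In this sketch a recording is labeled PD when its score exceeds a threshold (zero here). Averaging over frames keeps scores comparable across segments of different duration, which is one of the factors the study varies.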