Exploring traditional machine learning for identification of pathological auscultations

Paper title

Exploring traditional machine learning for identification of pathological auscultations

Authors

Razvadauskas, Haroldas; Vaiciukynas, Evaldas; Buskus, Kazimieras; Drukteinis, Lukas; Arlauskas, Lukas; Sadauskas, Saulius; Naudziunas, Albinas

Abstract

Today, data collection has improved in various areas, and the medical domain is no exception. Auscultation is an important diagnostic technique for physicians, and the progress and availability of digital stethoscopes make it well suited to machine learning. Because auscultations are performed in large numbers, the resulting data open up an opportunity for more effective analysis of sounds, where prognostic accuracy remains low even among experts. In this study, digital 6-channel auscultations of 45 patients were used in various machine learning scenarios, with the aim of distinguishing between normal and anomalous pulmonary sounds. Audio features (such as fundamental frequencies F0-4, loudness, HNR, DFA, as well as descriptive statistics of log energy, RMS and MFCC) were extracted using the Python library Surfboard. Windowing, feature aggregation, and concatenation strategies were used to prepare data for tree-based ensemble models in unsupervised (fair-cut forest) and supervised (random forest) machine learning settings. Evaluation used 9-fold stratified cross-validation repeated 30 times. Decision fusion by averaging outputs for a subject was tested and found to be useful. Supervised models showed a consistent advantage over unsupervised ones, achieving a mean AUC ROC of 0.691 (accuracy 71.11%, Kappa 0.416, F1-score 0.771) in side-based detection and a mean AUC ROC of 0.721 (accuracy 68.89%, Kappa 0.371, F1-score 0.650) in patient-based detection.
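
The evaluation scheme named in the abstract (a random forest, 9-fold stratified cross-validation repeated 30 times, and decision fusion by averaging outputs for a subject) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. The feature matrix is synthetic, the class split of 20 normal and 25 abnormal subjects is invented, and the helper names `make_synthetic_data` and `evaluate` are hypothetical. Splitting folds at the subject level, so that all channels of one patient stay in the same fold, is also an assumption, since the abstract does not state how folds were formed. The unsupervised fair-cut forest is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RepeatedStratifiedKFold


def make_synthetic_data(n_subjects=45, n_channels=6, n_features=20, seed=0):
    """Hypothetical stand-in for per-channel audio feature vectors."""
    rng = np.random.default_rng(seed)
    # Invented class balance: 20 normal (0) and the rest abnormal (1).
    subject_labels = np.array([0] * 20 + [1] * (n_subjects - 20))
    # One row per recording; each subject contributes n_channels rows.
    subject_ids = np.repeat(np.arange(n_subjects), n_channels)
    y = subject_labels[subject_ids]
    # Shift abnormal recordings slightly so there is some signal to learn.
    X = rng.normal(size=(n_subjects * n_channels, n_features)) + 0.5 * y[:, None]
    return X, y, subject_ids, subject_labels


def evaluate(X, y, subject_ids, subject_labels, n_splits=9, n_repeats=30, seed=0):
    """Repeated stratified CV over subjects with per-subject score averaging."""
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    subjects = np.arange(len(subject_labels))
    aucs = []
    for train_subj, test_subj in cv.split(subjects, subject_labels):
        # Assumption: split by subject so no patient is in both train and test.
        train_mask = np.isin(subject_ids, train_subj)
        test_mask = ~train_mask
        model = RandomForestClassifier(n_estimators=100, random_state=seed)
        model.fit(X[train_mask], y[train_mask])
        proba = model.predict_proba(X[test_mask])[:, 1]
        # Decision fusion: average the recording-level scores of each subject.
        test_ids = subject_ids[test_mask]
        fused = np.array([proba[test_ids == s].mean() for s in test_subj])
        aucs.append(roc_auc_score(subject_labels[test_subj], fused))
    return float(np.mean(aucs)), len(aucs)
```

Calling `evaluate(*make_synthetic_data())` returns the mean AUC ROC over all 9 x 30 = 270 folds together with the fold count. The score it produces reflects only the synthetic data and says nothing about the results reported in the abstract.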
