Interpretable Classification of Bacterial Raman Spectra with Knockoff Wavelets

Paper title

Interpretable Classification of Bacterial Raman Spectra with Knockoff Wavelets

Authors

Charmaine Chia, Matteo Sesia, Chi-Sing Ho, Stefanie S. Jeffrey, Jennifer Dionne, Emmanuel J. Candès, Roger T. Howe

Abstract

Deep neural networks and other sophisticated machine learning models are widely applied to biomedical signal data because they can detect complex patterns and compute accurate predictions. However, the difficulty of interpreting such models is a limitation, especially for applications involving high-stakes decisions, including the identification of bacterial infections. In this paper, we consider fast Raman spectroscopy data and demonstrate that a logistic regression model with carefully selected features achieves accuracy comparable to that of neural networks, while being much simpler and more transparent. Our analysis leverages wavelet features with intuitive chemical interpretations, and performs controlled variable selection with knockoffs to ensure the predictors are relevant and non-redundant. Although we focus on a particular data set, the proposed approach is broadly applicable to other types of signal data for which interpretability may be important.
