Paper title
Autoencoder-based Attribute Noise Handling Method for Medical Data
Paper authors
Abstract
Medical datasets are particularly subject to attribute noise, that is, missing and erroneous values. Attribute noise is known to be largely detrimental to learning performance. To maximize future learning performance, it is essential to deal with attribute noise before any inference. We propose a simple autoencoder-based preprocessing method that can correct mixed-type tabular data corrupted by attribute noise. No other method currently exists to handle attribute noise in tabular data. We experimentally demonstrate that our method outperforms both state-of-the-art imputation methods and noise correction methods on several real-world medical datasets.