Paper title
Counterfactual-based minority oversampling for imbalanced classification
Paper authors
Abstract
A key challenge of oversampling in imbalanced classification is that the generation of new minority samples often neglects the majority classes, so that most new minority samples are spread across the whole minority space. In view of this, we present a new oversampling framework based on counterfactual theory. Our framework introduces a counterfactual objective that leverages the rich inherent information of the majority classes and explicitly perturbs majority samples to generate new samples within the minority space. It can be shown analytically that the new minority samples satisfy the minimum inversion, and therefore most of them lie near the decision boundary. Empirical evaluations on benchmark datasets suggest that our approach significantly outperforms state-of-the-art methods.
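The idea of perturbing a majority sample just far enough to flip its label can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a fixed linear classifier (hypothetical weights `w` and bias `b`, with a non-negative score read as the minority class) and uses the closed-form smallest L2 step onto the decision boundary, plus a small hypothetical `overshoot` so the new point lands just inside the minority side. The function names and the choice to start from the majority samples closest to the boundary are illustrative assumptions.

```python
from typing import List, Sequence


def score(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Signed score of an assumed linear classifier; >= 0 is read as minority."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def counterfactual_minority(
    x: Sequence[float], w: Sequence[float], b: float, overshoot: float = 0.05
) -> List[float]:
    """Move a majority sample the smallest L2 distance needed to flip its label.

    The minimal step onto the boundary w.x + b = 0 is -score / ||w||^2 * w;
    `overshoot` (an illustrative parameter) pushes the point slightly past it.
    """
    s = score(x, w, b)
    if s >= 0:
        # Already on the minority side: nothing to flip.
        return list(x)
    norm_sq = sum(wi * wi for wi in w)
    if norm_sq == 0:
        raise ValueError("weight vector must be non-zero")
    step = -(1.0 + overshoot) * s / norm_sq
    return [xi + step * wi for xi, wi in zip(x, w)]


def oversample(
    majority: Sequence[Sequence[float]],
    w: Sequence[float],
    b: float,
    n_new: int,
    overshoot: float = 0.05,
) -> List[List[float]]:
    """Create up to n_new synthetic minority points from majority samples.

    Assumption for this sketch: start from the majority samples that are
    closest to the boundary, since they need the smallest perturbation.
    """
    candidates = [x for x in majority if score(x, w, b) < 0]
    ranked = sorted(candidates, key=lambda x: abs(score(x, w, b)))
    return [counterfactual_minority(x, w, b, overshoot) for x in ranked[:n_new]]
```

Because each synthetic point is obtained by the smallest label-flipping move under this toy classifier, the generated points sit close to the decision boundary, which mirrors the property the abstract claims for the actual method.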