Paper title
Online Learning With Adaptive Rebalancing in Nonstationary Environments
Paper authors
Paper abstract
An enormous and ever-growing volume of data is nowadays becoming available in a sequential fashion in various real-world applications. Learning in nonstationary environments constitutes a major challenge, and this problem becomes orders of magnitude more complex in the presence of class imbalance. We provide new insights into learning from nonstationary and imbalanced data in online learning, a largely unexplored area. We propose the novel Adaptive REBAlancing (AREBA) algorithm that selectively includes in the training set a subset of the majority and minority examples that appeared so far, while at its heart lies an adaptive mechanism to continually maintain the class balance between the selected examples. We compare AREBA with strong baselines and other state-of-the-art algorithms and perform extensive experimental work in scenarios with various class imbalance rates and different concept drift types on both synthetic and real-world data. AREBA significantly outperforms the rest with respect to both learning speed and learning quality. Our code is made publicly available to the scientific community.
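The rebalancing idea described in the abstract can be illustrated with a small sketch. This is not the authors' published AREBA implementation. It only assumes, for illustration, that recent examples are kept in one fixed-size first-in-first-out queue per class and that the training set takes the same number of most recent examples from each class. The name `BalancedBuffer`, its `per_class_capacity` parameter, and the binary labels 0 (majority) and 1 (minority) are all hypothetical.

```python
from collections import deque


class BalancedBuffer:
    """Hypothetical helper: per-class FIFO queues with a class-balanced view."""

    def __init__(self, per_class_capacity):
        # One bounded queue per class; old examples fall out automatically,
        # which lets the buffer follow a drifting data stream.
        self.queues = {
            0: deque(maxlen=per_class_capacity),
            1: deque(maxlen=per_class_capacity),
        }

    def add(self, x, y):
        """Store the newly arrived example x under its class label y."""
        self.queues[y].append(x)

    def training_set(self):
        """Return the k most recent examples of each class seen so far,
        where k is the length of the shortest non-empty queue."""
        nonempty = [q for q in self.queues.values() if q]
        if not nonempty:
            return []
        k = min(len(q) for q in nonempty)
        batch = []
        for label, q in self.queues.items():
            if q:
                batch.extend((x, label) for x in list(q)[-k:])
        return batch


# Imbalanced toy stream: nine majority examples, then one minority example.
buf = BalancedBuffer(per_class_capacity=5)
for x, y in [(i, 0) for i in range(9)] + [(99, 1)]:
    buf.add(x, y)

balanced = buf.training_set()
```

In this toy stream the majority queue holds five examples and the minority queue holds one, so the balanced view contains one example from each class. A classifier trained incrementally on such a view sees both classes equally often, which is the effect the abstract attributes to the adaptive mechanism; how AREBA actually sizes and updates its selection is defined in the paper and its public code, not here.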