Paper Title
Learning Optimal Fair Classification Trees: Trade-offs Between Interpretability, Fairness, and Accuracy
Paper Authors
Paper Abstract
The increasing use of machine learning in high-stakes domains -- where people's livelihoods are impacted -- creates an urgent need for interpretable, fair, and highly accurate algorithms. With these needs in mind, we propose a mixed integer optimization (MIO) framework for learning optimal classification trees -- one of the most interpretable models -- that can be augmented with arbitrary fairness constraints. In order to better quantify the "price of interpretability", we also propose a new measure of model interpretability called decision complexity that allows for comparisons across different classes of machine learning models. We benchmark our method against state-of-the-art approaches for fair classification on popular datasets; in doing so, we conduct one of the first comprehensive analyses of the trade-offs between interpretability, fairness, and predictive accuracy. Given a fixed disparity threshold, our method has a price of interpretability of about 4.2 percentage points in terms of out-of-sample accuracy compared to the best-performing complex models. However, our method consistently finds decisions with almost full parity, while other methods rarely do.
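To make the "arbitrary fairness constraints" and "fixed disparity threshold" mentioned above concrete, below is a minimal sketch -- not the paper's actual formulation -- of how a statistical-parity constraint enters a mixed integer program. It assumes the PuLP modeling library and a binary protected attribute; to keep the example short, the toy model chooses binary predictions z_i directly rather than tying them to tree-structure variables, purely to illustrate that the disparity bound is linear in the decision variables.

```python
# Toy sketch (not the paper's MIO formulation): maximize training
# accuracy subject to a statistical-parity constraint
#   |mean(z | a=1) - mean(z | a=0)| <= delta,
# written as two linear constraints. Assumes the PuLP library.
import pulp

# Illustrative data: labels y and a binary protected attribute a.
y = [1, 0, 1, 1, 0, 1, 0, 0]
a = [1, 1, 1, 1, 0, 0, 0, 0]
n = len(y)
delta = 0.1  # fixed disparity threshold

prob = pulp.LpProblem("fair_predictions", pulp.LpMaximize)

# Binary prediction variables z_i. In a full tree formulation these
# would be determined by branching/leaf-assignment variables; here
# they are free, purely for illustration.
z = [pulp.LpVariable(f"z_{i}", cat="Binary") for i in range(n)]

# Objective: number of correct predictions (linear, since y is data).
prob += pulp.lpSum(z[i] if y[i] == 1 else 1 - z[i] for i in range(n))

# Statistical parity as two linear constraints on group positive rates.
g1 = [i for i in range(n) if a[i] == 1]
g0 = [i for i in range(n) if a[i] == 0]
rate1 = pulp.lpSum(z[i] for i in g1) * (1.0 / len(g1))
rate0 = pulp.lpSum(z[i] for i in g0) * (1.0 / len(g0))
prob += rate1 - rate0 <= delta
prob += rate0 - rate1 <= delta

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([int(pulp.value(v)) for v in z])
```

Because the parity bound is linear in the binary decision variables, the same pair of constraints can be attached to a richer MIO that also encodes tree structure; other group-fairness notions (e.g., bounds on per-group error rates) can be linearized in the same way, which is what makes "arbitrary fairness constraints" feasible in this framework.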