Multicollinearity Correction and Combined Feature Effect in Shapley Values

Paper Title

Multicollinearity Correction and Combined Feature Effect in Shapley Values

Authors

Basu, Indranil; Maji, Subhadip

Abstract

Model interpretability is one of the most intriguing problems for most machine learning models, particularly for those that are mathematically sophisticated. Computing Shapley values is arguably the best approach so far for finding the importance of each feature in a model at the row level. In other words, Shapley values represent the importance of a feature for a particular row, especially in classification or regression problems. One of the biggest limitations of Shapley values is that their calculation assumes all features are uncorrelated (independent of each other), an assumption that is often incorrect. To address this problem, we present a unified framework for calculating Shapley values with correlated features. More specifically, we apply an adjustment (a matrix formulation) to the features while calculating independent Shapley values for the rows. Moreover, we give a mathematical proof for these adjustments. With these adjustments, the Shapley values (importances) of the features become independent of the correlations existing between them. We have also extended this adjustment concept to more than one feature. Because Shapley values are additive, the combined effect of two features is ordinarily calculated by simply adding their individual Shapley values. This is again not right if one or more of the features used in the combination are correlated with other features not in the combination. We address this problem as well by extending the correlation adjustment for one feature to the multiple features in the combination for which Shapley values are determined. Our implementation shows that the method is also computationally efficient compared to the original Shapley method.
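
The additivity property and the independence assumption mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' correlation adjustment, which the abstract does not specify; it is the classic exact Shapley computation on a hypothetical three-feature linear model. The names `model`, `row`, `baseline`, and `value_fn` are assumptions made up for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function over n_features players."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Classic Shapley weight: |S|! (n - |S| - 1)! / n!
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                # Marginal contribution of feature i when it joins the coalition
                phi[i] += weight * (value_fn(coalition | {i}) - value_fn(coalition))
    return phi


# Hypothetical toy model and data row (not from the paper)
def model(x):
    return 2.0 * x[0] + 3.0 * x[1] - 1.0 * x[2]


row = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]


def value_fn(coalition):
    # Features outside the coalition are replaced by baseline values,
    # regardless of the values of features inside it. This is where the
    # independence assumption enters: correlations between features are ignored.
    mixed = [row[j] if j in coalition else baseline[j] for j in range(len(row))]
    return model(mixed)


phi = shapley_values(value_fn, len(row))

# Unadjusted combined effect of features 0 and 1: the plain sum of their values
combined_01 = phi[0] + phi[1]

print("Shapley values:", phi)
print("Combined effect of features 0 and 1:", combined_01)
print("Sum of all values:", sum(phi), "vs", model(row) - model(baseline))
```

For a linear model with this baseline replacement, each attribution equals the coefficient times the feature's offset from the baseline, and the attributions sum to the difference between the model output on the row and on the baseline. The plain sum `combined_01` is the quantity the abstract argues is unreliable when features 0 or 1 are correlated with feature 2.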
