Paper Title
Reward Systems for Trustworthy Medical Federated Learning
Paper Authors
Abstract
Federated learning (FL) has attracted strong interest from researchers and practitioners for training machine learning (ML) models in healthcare. Ensuring the trustworthiness of these models is essential. In particular, bias, defined as a disparity in a model's predictive performance across different subgroups, may cause unfairness against specific subgroups, an undesired phenomenon in trustworthy ML models. In this research, we address the question of to what extent bias occurs in medical FL and how to prevent excessive bias through reward systems. We first evaluate how to measure institutions' contributions toward predictive performance and bias in cross-silo medical FL using a Shapley value approximation method. In a second step, we design different reward systems that incentivize contributions toward high predictive performance or low bias, and we then propose a combined reward system that incentivizes contributions toward both. We evaluate our work on multiple medical chest X-ray datasets, focusing on patient subgroups defined by patient sex and age. Our results show that contributions toward bias can be measured successfully and that the combined reward system successfully incentivizes contributions toward a well-performing model with low bias. While the partitioning of scans only slightly influences the overall bias, institutions whose data stems predominantly from one subgroup introduce a favorable bias for that subgroup. Our results further indicate that reward systems that focus only on predictive performance can transfer model bias against patients to an institutional level. Our work helps researchers and practitioners design reward systems for FL with well-aligned incentives for trustworthy ML.
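The contribution measurement described above can be illustrated with a minimal sketch. This is not the paper's actual method (the paper uses an approximation; this toy computes exact Shapley values, which is only feasible for a handful of institutions), and the utility function `v` is a hypothetical stand-in for any coalition-level metric, such as the accuracy of a model trained on the coalition's data, or a negated bias measure:

```python
import itertools
import math

def shapley_values(institutions, utility):
    """Exact Shapley values: each institution's marginal contribution
    to the coalition utility, averaged over all join orderings."""
    n = len(institutions)
    values = {i: 0.0 for i in institutions}
    for perm in itertools.permutations(institutions):
        coalition = []
        prev = utility(frozenset(coalition))
        for inst in perm:
            coalition.append(inst)
            cur = utility(frozenset(coalition))
            values[inst] += cur - prev  # marginal contribution of inst
            prev = cur
    # Average over all n! orderings
    return {i: v / math.factorial(n) for i, v in values.items()}

# Hypothetical additive utility standing in for a real evaluation
# (e.g., test accuracy of a model trained on the coalition's data).
contrib = {"A": 0.5, "B": 0.3, "C": 0.2}
def v(coalition):
    return sum(contrib[i] for i in coalition)

print(shapley_values(["A", "B", "C"], v))
```

For an additive utility like this toy `v`, each institution's Shapley value equals its individual term; with a real, non-additive utility (model performance or bias), the values capture interaction effects between institutions' datasets, which is what makes them suitable as a basis for reward systems.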