Paper title
Every Query Counts: Analyzing the Privacy Loss of Exploratory Data Analyses
Paper authors
Paper abstract
An exploratory data analysis is an essential step for every data analyst to gain insights, evaluate data quality and (if required) select a machine learning model for further processing. While privacy-preserving machine learning is on the rise, more often than not this initial analysis is not counted towards the privacy budget. In this paper, we quantify the privacy loss for basic statistical functions and highlight the importance of taking it into account when calculating the privacy-loss budget of a machine learning approach.
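To illustrate why exploratory queries matter for the budget, here is a minimal sketch of privacy-loss accounting. It is not the paper's method, which the abstract does not describe. It assumes pure epsilon-differential privacy with basic sequential composition (the epsilons of successive queries add up), a Laplace mechanism for a bounded mean, and a publicly known dataset size. The names `PrivacyAccountant`, `laplace_noise`, and `noisy_mean` are hypothetical and chosen only for this example.

```python
import random


class PrivacyAccountant:
    """Tracks cumulative epsilon under basic sequential composition (an assumption)."""

    def __init__(self, total_epsilon):
        self.total_epsilon = total_epsilon
        self.spent = 0.0
        self.log = []  # (label, epsilon) for every query charged so far

    def charge(self, label, epsilon):
        """Record a query's privacy cost, refusing it if the budget would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise RuntimeError(f"budget exceeded by query {label!r}")
        self.spent += epsilon
        self.log.append((label, epsilon))

    @property
    def remaining(self):
        return self.total_epsilon - self.spent


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_mean(values, lower, upper, epsilon, accountant, rng, label="mean"):
    """Release a clipped mean with Laplace noise and charge epsilon to the budget.

    Assumes len(values) is public, so changing one record moves the clipped
    mean by at most (upper - lower) / n.
    """
    accountant.charge(label, epsilon)
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


# Illustrative data and budget; none of these numbers come from the paper.
rng = random.Random(0)
ages = [rng.randint(18, 90) for _ in range(1000)]
accountant = PrivacyAccountant(total_epsilon=1.0)

# Three exploratory queries, each costing epsilon = 0.1.
for i in range(3):
    noisy_mean(ages, 18, 90, 0.1, accountant, rng, label=f"eda-mean-{i}")

print(f"spent on exploration: {accountant.spent:.1f}")
print(f"left for model training: {accountant.remaining:.1f}")
```

In this sketch, three exploratory means consume 0.3 of a total budget of 1.0, so a later training step can use at most 0.7. If the exploration is left out of the accounting, the reported privacy guarantee is weaker than the one actually provided.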