Paper title
On Properties and Optimization of Information-theoretic Privacy Watchdog
Paper authors
Paper abstract
We study the problem of privacy preservation in data sharing, where $S$ is a sensitive variable to be protected and $X$ is a non-sensitive useful variable correlated with $S$. Variable $X$ is randomized into variable $Y$, which will be shared or released according to $p_{Y|X}(y|x)$. We measure privacy leakage by \emph{information privacy} (also known as \emph{log-lift} in the literature), which guarantees mutual information privacy and differential privacy (DP). Let $\Xepsc \subseteq \X$ contain the elements in the alphabet of $X$ for which the absolute value of log-lift (abs-log-lift for short) is greater than a desired threshold $\eps$. When elements $x\in \Xepsc$ are randomized into $y\in \Y$, we derive the best upper bound on the abs-log-lift across the resultant pairs $(s,y)$. We then prove that this bound is achievable via an \emph{$X$-invariant} randomization $p(y|x) = R(y)$ for $x,y\in\Xepsc$. However, imposing a strict upper bound $\eps$ on the abs-log-lift severely damages the utility, measured by the mutual information $I(X;Y)$. To remedy this, and inspired by probabilistic ($\eps$, $\delta$)-DP, we propose a relaxed ($\eps$, $\delta$)-log-lift framework. To achieve this relaxation, we introduce a greedy algorithm which exempts some elements in $\Xepsc$ from randomization, as long as their abs-log-lift is bounded by $\eps$ with probability $1-\delta$. Numerical results demonstrate the efficacy of this algorithm in achieving a better privacy-utility tradeoff.
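The quantities named in the abstract can be illustrated with a small numerical sketch. It is not the paper's algorithm. It assumes the standard log-lift definition $i(s,x) = \log\frac{p(s,x)}{p(s)p(x)}$, reads the high-risk set as the $x$ whose abs-log-lift exceeds $\eps$ for at least one $s$, and uses the simplest $X$-invariant randomization, in which every high-risk $x$ is mapped to one merged output symbol. The function names, the example joint distribution and the threshold are hypothetical, and all joint probabilities are assumed strictly positive.

```python
import math


def log_lift(joint):
    """Return i(s, x) = log(p(s, x) / (p(s) * p(x))) for a joint pmf joint[s][x].

    Assumes every entry of joint is strictly positive.
    """
    p_s = [sum(row) for row in joint]
    p_x = [sum(col) for col in zip(*joint)]
    return [
        [math.log(joint[s][x] / (p_s[s] * p_x[x])) for x in range(len(p_x))]
        for s in range(len(p_s))
    ]


def high_risk_set(joint, eps):
    """Indices x whose abs-log-lift exceeds eps for at least one s (assumed reading)."""
    lift = log_lift(joint)
    return [
        x for x in range(len(joint[0]))
        if max(abs(row[x]) for row in lift) > eps
    ]


def merged_abs_log_lift(joint, risky):
    """Largest abs-log-lift over s after all risky x are released as one merged symbol."""
    if not risky:
        return 0.0
    p_s = [sum(row) for row in joint]
    # Joint probability of each s with the merged symbol, and the merged symbol's mass.
    p_s_merged = [sum(row[x] for x in risky) for row in joint]
    p_merged = sum(p_s_merged)
    return max(
        abs(math.log(p_s_merged[s] / (p_s[s] * p_merged)))
        for s in range(len(p_s))
    )


# Hypothetical joint pmf: rows are values of S, columns are values of X.
example_joint = [
    [0.20, 0.05, 0.15, 0.10],
    [0.05, 0.20, 0.10, 0.15],
]
example_eps = 0.5

risky = high_risk_set(example_joint, example_eps)
print("high-risk x:", risky)
print("abs-log-lift after merging:", merged_abs_log_lift(example_joint, risky))
```

In this toy distribution the high-risk symbols are mirror images of each other, so merging them removes almost all leakage about $S$. In general the merged abs-log-lift need not fall below $\eps$, and the merge lowers $I(X;Y)$, which is the tradeoff the relaxed ($\eps$, $\delta$) framework targets.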