Paper Title
A Re-examination of the Census Bureau Reconstruction and Reidentification Attack
Paper Authors
Paper Abstract
Recent analysis by researchers at the U.S. Census Bureau claims that by reconstructing the tabular data released from the 2010 Census, it is possible to recover the original data and, using an accurate external data file containing identities, reidentify 179 million respondents (approximately 58% of the population). This study shows that there is a practically infinite number of possible reconstructions, and that each reconstruction assigns different identities to the respondents in the reconstructed data. The results reported by the Census Bureau researchers are based on just one of these possible reconstructions and are easily refuted by an alternate reconstruction. Without definitive proof that the reconstruction is unique, or at the very least that most reconstructions assign the same identity to the same respondent, claims of confirmed reidentification are highly suspect and easily refuted.
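The non-uniqueness argument can be illustrated with a toy sketch. This is not the paper's method or the Census Bureau's procedure. It assumes a hypothetical population of four people described only by two published one-way tables (age group and sex) and enumerates every joint table consistent with both. The function name, the margins, and the cell labels are all invented for illustration.

```python
from itertools import product


def consistent_reconstructions(age_margin, sex_margin):
    """Enumerate every 2x2 joint count table that matches both margins.

    Cells are ordered (young_m, young_f, old_m, old_f). This is a brute-force
    toy: real census tables are far larger and more tightly constrained.
    """
    total = sum(age_margin)
    solutions = []
    # Each cell count can range from 0 up to the population size.
    for young_m, young_f, old_m, old_f in product(range(total + 1), repeat=4):
        rows_ok = (young_m + young_f, old_m + old_f) == tuple(age_margin)
        cols_ok = (young_m + old_m, young_f + old_f) == tuple(sex_margin)
        if rows_ok and cols_ok:
            solutions.append((young_m, young_f, old_m, old_f))
    return solutions


# Hypothetical published tables: 2 young, 2 old; 2 male, 2 female.
tables = consistent_reconstructions(age_margin=(2, 2), sex_margin=(2, 2))
for table in tables:
    print(table)
```

In this toy case three different joint tables reproduce the same published margins, so the margins alone do not say which records existed. Whether the much richer set of 2010 Census tables leaves comparable ambiguity is the question the paper addresses; this sketch only shows what non-uniqueness means.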