The U.S. Census Bureau Tries to Be a Good Data Steward in the 21st Century


The Fundamental Law of Information Reconstruction, a.k.a. the Database Reconstruction Theorem, exposes a vulnerability not only in the way statistical agencies have traditionally published data, but also in the way Amazon, Apple, Facebook, Google, and other Internet giants publish data. Solutions are needed for publishing information from these data while still providing meaningful privacy and confidentiality protections to data providers.
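As a toy illustration of the reconstruction idea (not the Census Bureau's own analysis), the sketch below assumes a hypothetical block of three people and four published statistics about their ages. The numbers, the `PUBLISHED` table, and the `consistent_databases` helper are all invented for this example. It enumerates every possible set of ages and keeps those that agree with the published figures.

```python
from itertools import combinations_with_replacement
from statistics import median

# Hypothetical published statistics for a block of three people (invented values).
PUBLISHED = {"count": 3, "mean": 44, "median": 30, "max": 78}

def consistent_databases(published, max_age=115):
    """Return every sorted tuple of ages that matches all published statistics."""
    matches = []
    n = published["count"]
    # combinations_with_replacement yields non-decreasing tuples, so each
    # candidate database (ignoring the order of people) is visited once.
    for ages in combinations_with_replacement(range(max_age + 1), n):
        if (sum(ages) == published["mean"] * n
                and median(ages) == published["median"]
                and max(ages) == published["max"]):
            matches.append(ages)
    return matches

candidates = consistent_databases(PUBLISHED)
print(candidates)  # → [(24, 30, 78)]
```

In this small case only one set of ages is consistent with the published statistics, so the exact ages are recovered even though none was released directly. Real reconstructions involve far more tables and records, but the principle is the same.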

Fortunately for the public, the U.S. Census Bureau's curation of its data is regulated by a strict law that mandates publication for statistical purposes only and prohibits releasing data in a way that could identify any respondent as the source of specific items. The Census Bureau has interpreted that stricture as governed by the laws of probability: an external user should not be able to assert with reasonable certainty that particular values were supplied by an identified respondent. Traditional methods of disclosure avoidance fail because they cannot quantify that risk. Moreover, when these methods are assessed using current tools, the certainty with which specific values can be associated with identifiable individuals turns out to be much greater than anticipated.

In this Cyber Alliance talk, John Abowd, Associate Director for Research and Methodology at the Census Bureau and a professor at Cornell, will discuss how his agency has responded to these developments. The Census Bureau has committed to a transparent modernization of its data publishing systems using formal methods like differential privacy. The intention is to demonstrate that statistical data, fit for their intended uses, can be produced when the entire publication system is subject to a formal privacy-loss budget.
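To make the idea of a privacy-loss budget concrete, here is a minimal sketch of the textbook Laplace mechanism for counting queries, with a simple accountant that refuses further releases once the total budget is spent. It is an illustration under stated assumptions, not the Census Bureau's production system: the `PrivacyBudget` class, the `noisy_count` function, and the epsilon values are hypothetical, and each count is assumed to change by at most 1 when one person is added or removed.

```python
import random

class PrivacyBudget:
    """Tracks cumulative privacy loss (epsilon) under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy-loss budget exhausted")
        self.spent += epsilon

def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, budget, rng):
    """Release a count with Laplace noise of scale 1/epsilon (sensitivity 1)."""
    budget.spend(epsilon)  # charge the budget before releasing anything
    return true_count + laplace_noise(1.0 / epsilon, rng)

budget = PrivacyBudget(total_epsilon=1.0)
rng = random.Random(0)  # seeded only to make the demo repeatable
first = noisy_count(100, 0.5, budget, rng)
second = noisy_count(50, 0.5, budget, rng)
print(first, second, budget.spent)
```

After these two releases the budget of 1.0 is fully spent, so a third call to `noisy_count` raises `RuntimeError`. Every published statistic draws on the same finite budget, which is what makes the total privacy loss of the publication system quantifiable.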

3:30pm on Wednesday, October 23rd, 2019

