Differential Privacy & the 2020 Census: Ask a Data Privacy Expert
Census data is hugely consequential, determining how electoral district boundaries are redrawn and how many Congressional representatives each state gets. The 2020 Census uses differential privacy to protect personal information, but some states have opposed the change because the method will affect redistricting.
A group of data privacy experts, including three researchers from the Hariri Institute’s Center for Reliable Information Systems and Cyber Security (RISCS) – Aloni Cohen, Ran Canetti, and Adam Smith – filed an amicus brief supporting the United States (US) Census Bureau in Alabama’s lawsuit challenging the use of differential privacy in the 2020 Census. We asked the researchers about the goal of differential privacy and its implications:
What is differential privacy?
Cohen: Differential privacy is a family of mathematical definitions for quantifying and controlling the amount of individual-level information disclosed by computations. A computation is differentially private if the probability of seeing any particular result depends only weakly on any one person’s data. For example, one might want to compute the average salary of all of BU’s employees without disclosing information about the Hariri Institute Director’s salary. If you used a differentially private algorithm for computing averages, then anything about the Hariri Institute Director’s salary that could be gleaned from the result could also have been gleaned if the Director’s salary had been excluded from the computation altogether.
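To make the salary example concrete, here is a minimal sketch of one standard differentially private building block, the Laplace mechanism, applied to an average. This illustrates the general technique, not the Census Bureau’s algorithm; the function name, salary figures, clamping range, and the choice of epsilon = 1.0 are all assumptions made for the example.

```python
import numpy as np

def dp_average(values, lower, upper, epsilon, rng=None):
    """Epsilon-differentially private average via the Laplace mechanism.

    Each value is clamped to [lower, upper], so changing one person's
    data moves the mean by at most (upper - lower) / n. Adding Laplace
    noise with scale sensitivity / epsilon then satisfies
    epsilon-differential privacy for this query.
    """
    rng = rng or np.random.default_rng()
    clamped = np.clip(values, lower, upper)
    sensitivity = (upper - lower) / len(clamped)  # max effect of one person
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clamped.mean() + noise

# Hypothetical salaries in dollars:
salaries = [52_000, 61_500, 75_000, 120_000, 340_000]
print(dp_average(salaries, lower=0, upper=500_000, epsilon=1.0))
```

Because the noise is calibrated to how much any single salary can move the average, the distribution of possible outputs is nearly the same whether any one person’s salary is changed, which is exactly the guarantee described above.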

Why is the US Census Bureau switching to this method now?
Cohen: Differential privacy is young and the Census is complicated. Differential privacy was proposed in a 2006 paper co-authored by BU’s Adam Smith, along with Cynthia Dwork, Frank McSherry, and Kobbi Nissim. Planning for the once-per-decade Census takes years and years, and there would have been no way to use differential privacy in 2010. The Census Bureau’s use of differential privacy for the 2020 Census is on the cutting edge of disclosure avoidance in government statistical agencies.
Canetti: Another reason for making the change now is that the amount of data out there on every one of us has grown immensely since 2010. This “data pollution” makes preserving privacy extremely difficult, since Census data can now more easily be combined with publicly available data to link people with their Census responses. Differential privacy is the only viable benchmark (or measure) that guarantees privacy even in the presence of arbitrary external data, hence its critical importance for the Census.

How is differential privacy different from previous methods used by the US Census Bureau for protecting private information?
Cohen: Current approaches to disclosure avoidance other than differential privacy, including those used by the Census Bureau in 2010, are inadequate and put the confidentiality of individuals at serious risk. In particular, there is a very real possibility that 2010 Census responses can be accurately “reconstructed” from the published statistics. Differential privacy is future-proof: the guarantees do not depend on brittle assumptions about the motives, methods, or sophistication of a hypothetical privacy attacker who seeks to learn information about individuals. However, the eventual strength or weakness of these guarantees depends on a tunable “privacy parameter” (commonly denoted epsilon) that has not yet been finalized.
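As a rough illustration of what that tuning means, the sketch below (continuing the hypothetical Laplace-mechanism setting from the salary example above, not the Census Bureau’s actual system) shows the trade-off for a simple count query, whose answer any one person can change by at most 1: a smaller epsilon forces proportionally larger noise, so stronger privacy comes at the cost of less accurate statistics.

```python
# Illustrative only: Laplace noise scale for a count query (sensitivity 1)
# at different settings of the privacy parameter epsilon.
for epsilon in [0.1, 0.5, 1.0, 2.0]:
    scale = 1.0 / epsilon  # Laplace scale = sensitivity / epsilon
    print(f"epsilon={epsilon:4}: typical noise magnitude ~ +/-{scale:.1f}")
```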

Why is there pushback on the use of differential privacy methods?
Smith: The big change here is in how transparent the discussion of error is. Previously, Census numbers were taken to be exact, even though most people recognized they were not. Beyond errors from people not being properly counted, the Census Bureau was also adding perturbation to the data that was not publicly disclosed. The move to methods that guarantee differential privacy has forced a hard conversation about how to deal with the error inherent in any population count. The pushback is a combination of resistance to change and substantive questioning of whether the distortion introduced for privacy is worth it.
To read the contents of the amicus brief, click here.