Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder. Also, some of these works have been submitted for publication. Copyright may be transferred without further notice and this version may no longer be accessible.
OpenDP is a community effort to build a trustworthy suite of open-source tools for enabling privacy-protective analysis of sensitive personal data, focused on a library of algorithms for generating differentially private statistical releases. The target use cases for OpenDP are to enable government, industry, and academic institutions to safely and confidently share sensitive data to support scientifically oriented research and exploration in the public interest. We aim for OpenDP to flexibly grow with the rapidly advancing science of differential privacy, and be a pathway to bring the newest algorithmic developments to a wide array of practitioners.
OpenDP is led by Faculty Directors Gary King and Salil Vadhan and an Executive Committee at Harvard University, funded in part by a grant from the Sloan Foundation. Its efforts so far have included implementing a differentially private curator application in collaboration with Microsoft, and developing a framework for a community-driven OpenDP Commons through the work of an Ad Hoc Design Committee including external experts. Going forward, the project plans to engage with a wide community of stakeholders, establish partnerships with a wide variety of groups from industry, academia, and government, and adopt increasing levels of community governance.
In this survey, we present several computational analogues of entropy and illustrate how they are useful for constructing cryptographic primitives. Specifically, we focus on constructing pseudorandom generators and statistically hiding commitments from arbitrary one-way functions, and demonstrate that:
The security properties of these (and other) cryptographic primitives can be understood in terms of various computational analogues of entropy, and in particular how these computational measures of entropy can be very different from real, information-theoretic entropy.
It can be shown that every one-way function directly exhibits some gaps between real entropy and the various computational entropies.
Thus we can construct the desired cryptographic primitives by amplifying and manipulating the entropy gaps in a one-way function, through forms of repetition and hashing.
The constructions we present (which are from the past decade) are much simpler and more efficient than the original ones, and are based entirely on natural manipulations of new notions of computational entropy. The two constructions are "dual" to each other, whereby the construction of pseudorandom generators relies on a form of computational entropy ("pseudoentropy") being larger than the real entropy, while the construction of statistically hiding commitments relies on a form of computational entropy ("accessible entropy") being smaller than the real entropy. Beyond that difference, the two constructions share a common structure, using a very similar sequence of manipulations of real and computational entropy. As a warmup, we also "deconstruct" the classic construction of pseudorandom generators from one-way permutations using the modern language of computational entropy.
This survey is written in honor of Shafi Goldwasser and Silvio Micali.
Version History: Preliminary version workshopped at PLSC 2017.
Differential privacy is a formal mathematical framework for quantifying and managing privacy risks. It provides provable privacy protection against a wide range of potential attacks, including those currently unforeseen. Differential privacy is primarily studied in the context of the collection, analysis, and release of aggregate statistics. These range from simple statistical estimations, such as averages, to machine learning. Tools for differentially private analysis are now in early stages of implementation and use across a variety of academic, industry, and government settings. Interest in the concept is growing among potential users of the tools, as well as within legal and policy communities, as it holds promise as a potential approach to satisfying legal requirements for privacy protection when handling personal information. In particular, differential privacy may be seen as a technical solution for analyzing and sharing data while protecting the privacy of individuals in accordance with existing legal or policy requirements for de-identification or disclosure limitation.
This primer seeks to introduce the concept of differential privacy and its privacy implications to non-technical audiences. It provides a simplified and informal, but mathematically accurate, description of differential privacy. Using intuitive illustrations and limited mathematical formalism, it discusses the definition of differential privacy, how differential privacy addresses privacy risks, how differentially private analyses are constructed, and how such analyses can be used in practice. A series of illustrations is used to show how practitioners and policymakers can conceptualize the guarantees provided by differential privacy. These illustrations are also used to explain related concepts, such as composition (the accumulation of risk across multiple analyses), privacy loss parameters, and privacy budgets. This primer aims to provide a foundation that can guide future decisions when analyzing and sharing statistical data about individuals, informing individuals about the privacy protection they will be afforded, and designing policies and regulations for robust privacy protection.
Computational analogues of information-theoretic notions have given rise to some of the most interesting phenomena in the theory of computation. For example, computational indistinguishability, Goldwasser and Micali , which is the computational analogue of statistical distance, enabled the bypassing of Shannon’s impossibility results on perfectly secure encryption, and provided the basis for the computational theory of pseudorandomness. Pseudoentropy, Håstad, Impagliazzo, Levin, and Luby , a computational analogue of entropy, was the key to the fundamental result establishing the equivalence of pseudorandom generators and one-way functions, and has become a basic concept in complexity theory and cryptography.
This tutorial discusses two rather recent computational notions of entropy, both of which can be easily found in any one-way function, the most basic cryptographic primitive. The first notion is next-block pseudoentropy, Haitner, Reingold, and Vadhan , a refinement of pseudoentropy that enables simpler and more ecient construction of pseudorandom generators. The second is inaccessible entropy, Haitner, Reingold, Vadhan, andWee , which relates to unforgeability and is used to construct simpler and more efficient universal one-way hash functions and statistically hiding commitments.
March 2017: Manuscript v2 (see files attached); Errata
April 2017: Published Version (in Tutorials on the Foundations of Cryptography; see Publisher's Version link and also SPRINGER 2017.PDF, below)
Differential privacy is a theoretical framework for ensuring the privacy of individual-level data when performing statistical analysis of privacy-sensitive datasets. This tutorial provides an introduction to and overview of differential privacy, with the goal of conveying its deep connections to a variety of other topics in computational complexity, cryptography, and theoretical computer science at large. This tutorial is written in celebration of Oded Goldreich’s 60th birthday, starting from notes taken during a minicourse given by the author and Kunal Talwar at the 26th McGill Invitational Workshop on Computational Complexity .