# Publications

2012
Dwork, Cynthia, Moni Naor, and Salil Vadhan. “The privacy of the analyst and the power of the state.” In Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS '12), 400-409. IEEE, 2012.
We initiate the study of privacy for the analyst in differentially private data analysis. That is, not only will we be concerned with ensuring differential privacy for the data (i.e., individuals or customers), which is the usual concern of differential privacy, but we also consider (differential) privacy for the set of queries posed by each data analyst. The goal is to achieve privacy with respect to other analysts, or users of the system. This problem arises only in the context of stateful privacy mechanisms, in which the responses to queries depend on other queries posed (a recent wave of results in the area utilized cleverly coordinated noise and state in order to privately answer a huge number of queries). We argue that the problem is real by proving an exponential gap between the number of queries that can be answered (with non-trivial error) by stateless and stateful differentially private mechanisms. We then give a stateful algorithm for differentially private data analysis that also ensures differential privacy for the analyst and can answer exponentially many queries.
Gopalan, Parikshit, Raghu Meka, Omer Reingold, Luca Trevisan, and Salil Vadhan. “Better pseudorandom generators via milder pseudorandom restrictions.” In Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science (FOCS '12), 120-129. IEEE, 2012.

Version History: Full version posted as ECCC TR12-123 and as arXiv:1210.0049 [cs.CC].

We present an iterative approach to constructing pseudorandom generators, based on the repeated application of mild pseudorandom restrictions. We use this template to construct pseudorandom generators for combinatorial rectangles and read-once $$\mathsf{CNF}$$s and a hitting set generator for width-3 branching programs, all of which achieve near-optimal seed-length even in the low-error regime: We get seed-length $$\tilde{O}(\log(n/\epsilon))$$ for error $$\epsilon$$. Previously, only constructions with seed-length $$O(\log^{3/2} n)$$ or $$O(\log^2 n)$$ were known for these classes with error $$\epsilon = 1/\mathrm{poly}(n)$$. The (pseudo)random restrictions we use are milder than those typically used for proving circuit lower bounds in that we only set a constant fraction of the bits at a time. While such restrictions do not simplify the functions drastically, we show that they can be derandomized using small-bias spaces.

2011
Kamp, Jesse, Anup Rao, Salil Vadhan, and David Zuckerman. “Deterministic extractors for small-space sources.” Journal of Computer and System Sciences 77, no. 1 (2011): 191-220.

Version History: Special issue to celebrate Richard Karp's Kyoto Prize. Extended abstract in STOC '06.

We give polynomial-time, deterministic randomness extractors for sources generated in small space, where we model space $$s$$ sources on $$\{0,1\}^n$$ as sources generated by width $$2^s$$ branching programs. Specifically, there is a constant $$η > 0$$ such that for any $$ζ > n^{-η}$$, our algorithm extracts $$m = (δ - ζ)n$$ bits that are exponentially close to uniform (in variation distance) from space $$s$$ sources with min-entropy $$δn$$, where $$s = Ω(ζ^3 n)$$. Previously, nothing was known for $$δ \ll 1/2$$, even for space $$0$$. Our results are obtained by a reduction to the class of total-entropy independent sources. This model generalizes both the well-studied models of independent sources and symbol-fixing sources. These sources consist of a set of $$r$$ independent smaller sources over $$\{0, 1\}^\ell$$, where the total min-entropy over all the smaller sources is $$k$$. We give deterministic extractors for such sources when $$k$$ is as small as $$\mathrm{polylog}(r)$$, for small enough $$\ell$$.

Chung, Kai-Min, Omer Reingold, and Salil Vadhan. “S-T connectivity on digraphs with a known stationary distribution.” ACM Transactions on Algorithms 7, no. 3 (2011).

Version History: Preliminary versions in CCC '07 and on ECCC (TR07-030).

We present a deterministic logspace algorithm for solving S-T Connectivity on directed graphs if: (i) we are given a stationary distribution of the random walk on the graph in which both of the input vertices $$s$$ and $$t$$ have nonnegligible probability mass and (ii) the random walk which starts at the source vertex $$s$$ has polynomial mixing time. This result generalizes the recent deterministic logspace algorithm for S-T Connectivity on undirected graphs [Reingold, 2008]. It identifies knowledge of the stationary distribution as the gap between the S-T Connectivity problems we know how to solve in logspace (L) and those that capture all of randomized logspace (RL).

Dvir, Zeev, Dan Gutfreund, Guy Rothblum, and Salil Vadhan. “On approximating the entropy of polynomial mappings.” In Proceedings of the Second Symposium on Innovations in Computer Science (ICS 2011), 460-475. Tsinghua University Press, 2011.

Version History: Full version posted as ECCC TR10-60.

We investigate the complexity of the following computational problem:

Polynomial Entropy Approximation (PEA): Given a low-degree polynomial mapping $$p : \mathbb{F}^n \to \mathbb{F}^m$$, where $$\mathbb{F}$$ is a finite field, approximate the output entropy $$H(p(U_n))$$, where $$U_n$$ is the uniform distribution on $$\mathbb{F}^n$$ and $$H$$ may be any of several entropy measures.

We show:

• Approximating the Shannon entropy of degree 3 polynomials $$p : \mathbb{F}_2^n \to \mathbb{F}^m_2$$ over $$\mathbb{F}_2$$ to within an additive constant (or even $$n^{.9}$$) is complete for $$\mathbf{SZKP_L}$$, the class of problems having statistical zero-knowledge proofs where the honest verifier and its simulator are computable in logarithmic space. ($$\mathbf{SZKP_L}$$ contains most of the natural problems known to be in the full class $$\mathbf{SZKP}$$.)

• For prime fields $$\mathbb{F} \neq \mathbb{F}_2$$ and homogeneous quadratic polynomials $$p : \mathbb{F}^n \to \mathbb{F}^m$$, there is a probabilistic polynomial-time algorithm that distinguishes the case that $$p(U_n)$$ has entropy smaller than $$k$$ from the case that $$p(U_n)$$ has min-entropy (or even Renyi entropy) greater than $$(2 + o(1))k$$.

• For degree $$d$$ polynomials $$p : \mathbb{F}^n_2 \to \mathbb{F}^m_2$$, there is a polynomial-time algorithm that distinguishes the case that $$p(U_n)$$ has max-entropy smaller than $$k$$ (where the max-entropy of a random variable is the logarithm of its support size) from the case that $$p(U_n)$$ has max-entropy at least $$(1 + o(1)) \cdot k^d$$ (for fixed $$d$$ and large $$k$$).
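
For very small $$n$$, the entropy measures named in this abstract can be computed exactly by enumerating all inputs. The following is a minimal brute-force sketch over $$\mathbb{F}_2$$, not one of the paper's algorithms; the function names and the example mapping are hypothetical and chosen only for illustration.

```python
import itertools
import math
from collections import Counter


def output_entropies(p, n):
    """Return (Shannon, min, max) entropy in bits of p(U_n), U_n uniform on F_2^n."""
    counts = Counter(p(x) for x in itertools.product((0, 1), repeat=n))
    total = 2 ** n
    probs = [c / total for c in counts.values()]
    shannon = -sum(q * math.log2(q) for q in probs)
    min_entropy = -math.log2(max(probs))   # determined by the likeliest output
    max_entropy = math.log2(len(probs))    # logarithm of the support size
    return shannon, min_entropy, max_entropy


def example_map(x):
    # Hypothetical degree-2 mapping F_2^3 -> F_2^2: (x0 * x1, x1 + x2).
    x0, x1, x2 = x
    return ((x0 * x1) % 2, (x1 + x2) % 2)


if __name__ == "__main__":
    print(output_entropies(example_map, 3))
```

The enumeration takes time exponential in $$n$$, so it only serves to make the definitions concrete.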

Mahmoody, Mohammad, Tal Moran, and Salil Vadhan. “Time-lock puzzles in the random oracle model.” In P. Rogaway, editor, Advances in Cryptology—CRYPTO '11, Lecture Notes in Computer Science, 6841:39-50. Springer-Verlag, 2011.

A time-lock puzzle is a mechanism for sending messages “to the future”. The sender publishes a puzzle whose solution is the message to be sent, thus hiding it until enough time has elapsed for the puzzle to be solved. For time-lock puzzles to be useful, generating a puzzle should take less time than solving it. Since adversaries may have access to many more computers than honest solvers, massively parallel solvers should not be able to produce a solution much faster than serial ones.

To date, we know of only one mechanism that is believed to satisfy these properties: the one proposed by Rivest, Shamir and Wagner (1996), who originally introduced the notion of time-lock puzzles. Their puzzle is based on the serial nature of exponentiation and the hardness of factoring, and is therefore vulnerable to advances in factoring techniques (as well as to quantum attacks).

In this work, we study the possibility of constructing time-lock puzzles in the random-oracle model. Our main result is negative, ruling out time-lock puzzles that require more parallel time to solve than the total work required to generate a puzzle. In particular, this should rule out black-box constructions of such time-lock puzzles from one-way permutations and collision-resistant hash-functions. On the positive side, we construct a time-lock puzzle with a linear gap in parallel time: a new puzzle can be generated with one round of $${n}$$ parallel queries to the random oracle, but $${n}$$ rounds of serial queries are required to solve it (even for massively parallel adversaries).
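
To make the linear gap concrete, here is a minimal sketch of one chained design that matches the shape described above: generation makes $$n$$ independent oracle queries (a single parallel round), while solving makes $$n$$ queries that each depend on the previous answer. This is an illustration under stated assumptions and not necessarily the paper's exact construction: SHA-256 stands in for the random oracle, and the names `generate`, `solve`, and `BLOCK` are hypothetical.

```python
import hashlib
import secrets

BLOCK = 32  # bytes; matches the SHA-256 digest size (assumed block length)


def oracle(x):
    # SHA-256 is only a stand-in for a random oracle.
    return hashlib.sha256(x).digest()


def xor(a, b):
    return bytes(u ^ v for u, v in zip(a, b))


def generate(message, n):
    """Build a puzzle hiding a BLOCK-byte message behind a chain of n oracle calls."""
    assert len(message) == BLOCK and n >= 1
    xs = [secrets.token_bytes(BLOCK) for _ in range(n)]
    hs = [oracle(x) for x in xs]  # queries are independent: one parallel round
    puzzle = [xs[0]]
    for i in range(1, n):
        puzzle.append(xor(hs[i - 1], xs[i]))  # x_i masked by the oracle value at x_{i-1}
    puzzle.append(xor(hs[n - 1], message))
    return puzzle


def solve(puzzle):
    """Recover the message; each query needs the result of the previous one."""
    x = puzzle[0]
    for masked in puzzle[1:]:
        x = xor(oracle(x), masked)
    return x
```

For example, `solve(generate(secrets.token_bytes(32), 1000))` returns the original 32-byte message after 1000 sequential hash evaluations.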

2010
Guruswami, Venkatesan, and Salil Vadhan. “A lower bound on list size for list decoding.” IEEE Transactions on Information Theory 56, no. 11 (2010): 5681-5688.

Version History: Preliminary version published in RANDOM '05 (https://link.springer.com/chapter/10.1007/11538462_27) and attached as RANDOM2005.pdf.

A $$q$$-ary error-correcting code $$C ⊆ \{1,2,...,q\}^n$$ is said to be list decodable to radius $$\rho$$ with list size $$L$$ if every Hamming ball of radius $$\rho$$ contains at most $$L$$ codewords of $$C$$. We prove that in order for a $$q$$-ary code to be list-decodable up to radius $$(1-1/q)(1-ε)n$$, we must have $$L = Ω(1/ε^2)$$. Specifically, we prove that there exist a constant $$c_q > 0$$ and a function $$f_q$$ such that for small enough $$ε > 0$$, if $$C$$ is list-decodable to radius $$(1-1/q)(1-ε)n$$ with list size $$c_q/ε^2$$, then $$C$$ has at most $$f_q(ε)$$ codewords, independent of $$n$$. This result is asymptotically tight (treating $$q$$ as a constant), since such codes with an exponential (in $$n$$) number of codewords are known for list size $$L = O(1/ε^2)$$.

A result similar to ours is implicit in Blinovsky [Bli] for the binary $$(q=2)$$ case. Our proof works for all alphabet sizes, and is technically and conceptually simpler.
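
The definition of list decodability used in this abstract can be checked directly on toy codes. The sketch below is a brute-force illustration only (exponential in $$n$$, unrelated to the proof technique); the function names are hypothetical, and the alphabet is taken to be $$\{0,\dots,q-1\}$$ rather than $$\{1,\dots,q\}$$ for convenience.

```python
import itertools


def hamming(u, v):
    """Number of coordinates in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def max_list_size(code, q, n, radius):
    """Largest number of codewords inside any Hamming ball of the given radius."""
    best = 0
    for center in itertools.product(range(q), repeat=n):
        in_ball = sum(1 for c in code if hamming(center, c) <= radius)
        best = max(best, in_ball)
    return best


def is_list_decodable(code, q, n, radius, L):
    """True if every ball of the given radius holds at most L codewords."""
    return max_list_size(code, q, n, radius) <= L


if __name__ == "__main__":
    repetition = [(0, 0, 0), (1, 1, 1)]  # binary repetition code of length 3
    print(max_list_size(repetition, 2, 3, 1))
    print(max_list_size(repetition, 2, 3, 2))
```

On the length-3 repetition code, radius 1 gives unique decoding, while at radius 2 some balls contain both codewords.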

Rothblum, Guy, and Salil Vadhan. “Are PCPs inherent in efficient arguments?” Computational Complexity 19, no. 2 (2010): 265-304.

Version History: Special Issue on CCC '09.

Starting with Kilian (STOC '92), several works have shown how to use probabilistically checkable proofs (PCPs) and cryptographic primitives such as collision-resistant hashing to construct very efficient argument systems (a.k.a. computationally sound proofs), for example with polylogarithmic communication complexity. Ishai et al. (CCC '07) raised the question of whether PCPs are inherent in efficient arguments, and if so, to what extent. We give evidence that they are, by showing how to convert any argument system whose soundness is reducible to the security of some cryptographic primitive into a PCP system whose efficiency is related to that of the argument system and the reduction (under certain complexity assumptions).

Birrell, Eleanor, and Salil Vadhan. “Composition of zero-knowledge proofs with efficient provers.” In Daniele Micciancio, editor, Proceedings of the 7th IACR Theory of Cryptography Conference (TCC '10), Lecture Notes in Computer Science, 5978:572-587. Springer-Verlag, 2010.

We revisit the composability of different forms of zero-knowledge proofs when the honest prover strategy is restricted to be polynomial time (given an appropriate auxiliary input). Our results are:

1. When restricted to efficient provers, the original Goldwasser–Micali–Rackoff (GMR) definition of zero knowledge (STOC '85), here called plain zero knowledge, is closed under a constant number of sequential compositions (on the same input). This contrasts with the case of unbounded provers, where Goldreich and Krawczyk (ICALP '90, SICOMP '96) exhibited a protocol that is zero knowledge under the GMR definition, but for which the sequential composition of 2 copies is not zero knowledge.

2. If we relax the GMR definition to only require that the simulation is indistinguishable from the verifier’s view by uniform polynomial-time distinguishers, with no auxiliary input beyond the statement being proven, then again zero knowledge is not closed under sequential composition of 2 copies.

3. We show that auxiliary-input zero knowledge with efficient provers is not closed under parallel composition of 2 copies under the assumption that there is a secure key agreement protocol (in which it is easy to recognize valid transcripts). Feige and Shamir (STOC '90) gave similar results under the seemingly incomparable assumptions that (a) the discrete logarithm problem is hard, or (b) $$\mathcal{UP}\nsubseteq\mathcal{BPP}$$ and one-way functions exist.
Dwork, Cynthia, Guy Rothblum, and Salil Vadhan. “Boosting and differential privacy.” In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS '10), 51-60. IEEE, 2010.
Boosting is a general method for improving the accuracy of learning algorithms. We use boosting to construct improved privacy-preserving synopses of an input database. These are data structures that yield, for a given set $$\mathcal{Q}$$ of queries over an input database, reasonably accurate estimates of the responses to every query in $$\mathcal{Q}$$, even when the number of queries is much larger than the number of rows in the database. Given a base synopsis generator that takes a distribution on $$\mathcal{Q}$$ and produces a "weak" synopsis that yields "good" answers for a majority of the weight in $$\mathcal{Q}$$, our Boosting for Queries algorithm obtains a synopsis that is good for all of $$\mathcal{Q}$$. We ensure privacy for the rows of the database, but the boosting is performed on the queries. We also provide the first synopsis generators for arbitrary sets of arbitrary low-sensitivity queries, i.e., queries whose answers do not vary much under the addition or deletion of a single row. In the execution of our algorithm certain tasks, each incurring some privacy loss, are performed many times. To analyze the cumulative privacy loss, we obtain an $$O(\varepsilon^2)$$ bound on the expected privacy loss from a single $$\varepsilon$$-differentially private mechanism. Combining this with evolution of confidence arguments from the literature, we get stronger bounds on the expected cumulative privacy loss due to multiple mechanisms, each of which provides $$\varepsilon$$-differential privacy or one of its relaxations, and each of which operates on (potentially) different, adaptively chosen, databases.
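
The gap between worst-case privacy loss $$\varepsilon$$ and expected privacy loss $$O(\varepsilon^2)$$ can be observed numerically on binary randomized response, a standard $$\varepsilon$$-differentially private mechanism. It is used here only as an assumed illustration and is not taken from the paper; the function names are hypothetical. For this mechanism the expected loss works out to $$\varepsilon \tanh(\varepsilon/2) \le \varepsilon^2/2$$.

```python
import math


def randomized_response_dists(eps):
    """Output distributions of randomized response on the two neighboring inputs."""
    p = math.exp(eps) / (1.0 + math.exp(eps))  # probability of reporting the true bit
    return (p, 1.0 - p), (1.0 - p, p)


def max_privacy_loss(P, Q):
    """Worst-case absolute log-ratio over outputs."""
    return max(abs(math.log(a / b)) for a, b in zip(P, Q))


def expected_privacy_loss(P, Q):
    """Expected log-ratio when outputs are drawn from P (the KL divergence)."""
    return sum(a * math.log(a / b) for a, b in zip(P, Q))


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.5, 1.0):
        P, Q = randomized_response_dists(eps)
        print(eps, max_privacy_loss(P, Q), expected_privacy_loss(P, Q))
```

The worst-case column tracks $$\varepsilon$$ while the expected column shrinks quadratically as $$\varepsilon$$ decreases.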
Chung, Kai-Min, Yael Kalai, and Salil Vadhan. “Improved delegation of computation using fully homomorphic encryption.” In T. Rabin, editor, Advances in Cryptology—CRYPTO '10, Lecture Notes in Computer Science, 6223:483-501. Springer-Verlag, 2010.

Version History: Full version posted as Cryptology ePrint Archive Report 2010/241.

Following Gennaro, Gentry, and Parno (Cryptology ePrint Archive 2009/547), we use fully homomorphic encryption to design improved schemes for delegating computation. In such schemes, a delegator outsources the computation of a function $${F}$$ on many, dynamically chosen inputs $$x_i$$ to a worker in such a way that it is infeasible for the worker to make the delegator accept a result other than $${F}(x_i)$$. The “online stage” of the Gennaro et al. scheme is very efficient: the parties exchange two messages, the delegator runs in time poly$$(\log{T})$$, and the worker runs in time poly$$(T)$$, where $$T$$ is the time complexity of $$F$$. However, the “offline stage” (which depends on the function $$F$$ but not the inputs to be delegated) is inefficient: the delegator runs in time poly$$(T)$$ and generates a public key of length poly$$(T)$$ that needs to be accessed by the worker during the online stage.

Our first construction eliminates the large public key from the Gennaro et al. scheme. The delegator still invests poly$$(T)$$ time in the offline stage, but does not need to communicate or publish anything. Our second construction reduces the work of the delegator in the offline stage to poly$$(\log{T})$$ at the price of a 4-message (offline) interaction with a poly$$(T)$$-time worker (which need not be the same as the workers used in the online stage). Finally, we describe a “pipelined” implementation of the second construction that avoids the need to re-run the offline construction after errors are detected (assuming errors are not too frequent).

McGregor, Andrew, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, and Salil Vadhan. “The limits of two-party differential privacy.” In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS '10), 81-90. IEEE, 2010.

Version History and Errata: Subsequent version published in ECCC 2011. Proposition 8 and Part (b) of Theorem 13 in the FOCS version are incorrect, and are removed from the ECCC version.

We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). In addition, those bounds demonstrate a dramatic gap between the accuracy that can be obtained by differentially private data analysis versus the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially private protocol is strongly related to the ability to approximate it by a low communication protocol. (The connection goes in both directions.)

2009
Guruswami, Venkatesan, Christopher Umans, and Salil Vadhan. “Unbalanced expanders and randomness extractors from Parvaresh–Vardy codes.” Journal of the ACM 56, no. 4 (2009): 1–34.

Version History: Preliminary versions of this article appeared as Technical Report TR06-134 in Electronic Colloquium on Computational Complexity, 2006, and in Proceedings of the 22nd Annual IEEE Conference on Computational Complexity (CCC '07), pp. 96–108. Preliminary version recipient of Best Paper Award at CCC '07.

We give an improved explicit construction of highly unbalanced bipartite expander graphs with expansion arbitrarily close to the degree (which is polylogarithmic in the number of vertices). Both the degree and the number of right-hand vertices are polynomially close to optimal, whereas the previous constructions of Ta-Shma et al. [2007] required at least one of these to be quasipolynomial in the optimal. Our expanders have a short and self-contained description and analysis, based on the ideas underlying the recent list-decodable error-correcting codes of Parvaresh and Vardy [2005].

Our expanders can be interpreted as near-optimal “randomness condensers,” that reduce the task of extracting randomness from sources of arbitrary min-entropy rate to extracting randomness from sources of min-entropy rate arbitrarily close to 1, which is a much easier task. Using this connection, we obtain a new, self-contained construction of randomness extractors that is optimal up to constant factors, while being much simpler than the previous construction of Lu et al. [2003] and improving upon it when the error parameter is small (e.g., 1/poly(n)).

Haitner, Iftach, Minh Nguyen, Shien Jin Ong, Omer Reingold, and Salil Vadhan. “Statistically hiding commitments and statistical zero-knowledge arguments from any one-way function.” SIAM Journal on Computing 39, no. 3 (2009): 1153-1218.

Version History: Special Issue on STOC '07. Merge of papers from FOCS '06 and STOC '07. Received SIAM Outstanding Paper Prize 2011.

We give a construction of statistically hiding commitment schemes (those in which the hiding property holds against even computationally unbounded adversaries) under the minimal complexity assumption that one-way functions exist. Consequently, one-way functions suffice to give statistical zero-knowledge arguments for any NP statement (whereby even a computationally unbounded adversarial verifier learns nothing other than the fact that the assertion being proven is true, and no polynomial-time adversarial prover can convince the verifier of a false statement). These results resolve an open question posed by Naor et al. [J. Cryptology, 11 (1998), pp. 87–108].

Ong, Shien Jin, David Parkes, Alon Rosen, and Salil Vadhan. “Fairness with an honest minority and a rational majority.” In O. Reingold, editor, Proceedings of the Fourth Theory of Cryptography Conference (TCC '09), Lecture Notes in Computer Science, 5444:36-53. Springer-Verlag, 2009.

Version History: Preliminary version posted as Cryptology ePrint Archive Report 2008/097, March 2008.

We provide a simple protocol for secret reconstruction in any threshold secret sharing scheme, and prove that it is fair when executed with many rational parties together with a small minority of honest parties. That is, all parties will learn the secret with high probability when the honest parties follow the protocol and the rational parties act in their own self-interest (as captured by a set-Nash analogue of trembling hand perfect equilibrium). The protocol only requires a standard (synchronous) broadcast channel, tolerates both early stopping and incorrectly computed messages, and only requires 2 rounds of communication.

Previous protocols for this problem in the cryptographic or economic models have either required an honest majority, used strong communication channels that enable simultaneous exchange of information, or settled for approximate notions of security/equilibria. They all also required a nonconstant number of rounds of communication.

Dodis, Yevgeniy, Salil Vadhan, and Daniel Wichs. “Proofs of retrievability via hardness amplification.” In O. Reingold, editor, Proceedings of the Fourth Theory of Cryptography Conference (TCC '09), Lecture Notes in Computer Science, 5444:109-127. Springer-Verlag, 2009.

Version History: Originally presented at Theory of Cryptography Conference (TCC) 2009. Full version published in Cryptology ePrint Archive (attached as ePrint2009).

Proofs of Retrievability (PoR), introduced by Juels and Kaliski [JK07], allow the client to store a file F on an untrusted server, and later run an efficient audit protocol in which the server proves that it (still) possesses the client’s data. Constructions of PoR schemes attempt to minimize the client and server storage, the communication complexity of an audit, and even the number of file-blocks accessed by the server during the audit. In this work, we identify several different variants of the problem (such as bounded-use vs. unbounded-use, knowledge-soundness vs. information-soundness), and give nearly optimal PoR schemes for each of these variants. Our constructions either improve (and generalize) the prior PoR constructions, or give the first known PoR schemes with the required properties. In particular, we

• Formally prove the security of an (optimized) variant of the bounded-use scheme of Juels and Kaliski [JK07], without making any simplifying assumptions on the behavior of the adversary.
• Build the first unbounded-use PoR scheme where the communication complexity is linear in the security parameter and which does not rely on Random Oracles, resolving an open question of Shacham and Waters [SW08].
• Build the first bounded-use scheme with information-theoretic security.

The main insight of our work comes from a simple connection between PoR schemes and the notion of hardness amplification, extensively studied in complexity theory. In particular, our improvements come from first abstracting a purely information-theoretic notion of PoR codes, and then building nearly optimal PoR codes using state-of-the-art tools from coding and complexity theory.

Dwork, Cynthia, Moni Naor, Omer Reingold, Guy Rothblum, and Salil Vadhan. “On the complexity of differentially private data release: Efficient algorithms and hardness results.” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC '09), 381-390. ACM, 2009.
We consider private data analysis in the setting in which a trusted and trustworthy curator, having obtained a large data set containing private information, releases to the public a "sanitization" of the data set that simultaneously protects the privacy of the individual contributors of data and offers utility to the data analyst. The sanitization may be in the form of an arbitrary data structure, accompanied by a computational procedure for determining approximate answers to queries on the original data set, or it may be a "synthetic data set" consisting of data items drawn from the same universe as items in the original data set; queries are carried out as if the synthetic data set were the actual input. In either case the process is non-interactive; once the sanitization has been released the original data and the curator play no further role. For the task of sanitizing with a synthetic dataset output, we map the boundary between computational feasibility and infeasibility with respect to a variety of utility measures. For the (potentially easier) task of sanitizing with unrestricted output format, we show a tight qualitative and quantitative connection between hardness of sanitizing and the existence of traitor tracing schemes.
Haitner, Iftach, Omer Reingold, Salil Vadhan, and Hoeteck Wee. “Inaccessible entropy.” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC '09), 611-620. ACM, 2009.

We put forth a new computational notion of entropy, which measures the (in)feasibility of sampling high entropy strings that are consistent with a given protocol. Specifically, we say that the $$i$$’th round of a protocol $$(\mathsf{A,B})$$ has accessible entropy at most $$k$$, if no polynomial-time strategy $$\mathsf{A}^*$$ can generate messages for $$\mathsf{A}$$ such that its message in the $$i$$’th round has entropy greater than $$k$$ when conditioned both on prior messages of the protocol and on prior coin tosses of $$\mathsf{A}^*$$. We say that the protocol has inaccessible entropy if the total accessible entropy (summed over the rounds) is noticeably smaller than the real entropy of $$\mathsf{A}$$’s messages, conditioned only on prior messages (but not the coin tosses of $$\mathsf{A}$$). As applications of this notion, we

• Give a much simpler and more efficient construction of statistically hiding commitment schemes from arbitrary one-way functions.

• Prove that constant-round statistically hiding commitments are necessary for constructing constant-round zero-knowledge proof systems for NP that remain secure under parallel composition (assuming the existence of one-way functions).

Trevisan, Luca, Madhur Tulsiani, and Salil Vadhan. “Regularity, boosting, and efficiently simulating every high-entropy distribution.” In Proceedings of the 24th Annual IEEE Conference on Computational Complexity (CCC '09), 126-136. IEEE, 2009.

Version History: Preliminary version posted as ECCC TR08-103.

We show that every bounded function $${g}$$ : $$\{0,1\}^n \to [0,1]$$ admits an efficiently computable "simulator" function $${h}$$ : $$\{0,1\}^n \to [0,1]$$ such that every fixed polynomial size circuit has approximately the same correlation with $${g}$$ as with $${h}$$. If $${g}$$ describes (up to scaling) a high min-entropy distribution $$D$$, then $${h}$$ can be used to efficiently sample a distribution $$D'$$ of the same min-entropy that is indistinguishable from $$D$$ by circuits of fixed polynomial size. We state and prove our result in a more abstract setting, in which we allow arbitrary finite domains instead of $$\{0,1\}^n$$, and arbitrary families of distinguishers, instead of fixed polynomial size circuits. Our result implies (a) the weak Szemeredi regularity Lemma of Frieze and Kannan, (b) a constructive version of the dense model theorem of Green, Tao and Ziegler with better quantitative parameters (polynomial rather than exponential in the distinguishing probability), and (c) the Impagliazzo hardcore set Lemma. It appears to be the general result underlying the known connections between "regularity" results in graph theory, "decomposition" results in additive combinatorics, and the hardcore Lemma in complexity theory. We present two proofs of our result, one in the spirit of Nisan's proof of the hardcore Lemma via duality of linear programming, and one similar to Impagliazzo's "boosting" proof. A third proof by iterative partitioning, which gives the complexity of the sampler to be exponential in the distinguishing probability, is also implicit in the Green-Tao-Ziegler proofs of the dense model theorem.

Mironov, Ilya, Omkant Pandey, Omer Reingold, and Salil Vadhan. “Computational differential privacy.” In S. Halevi, editor, Advances in Cryptology—CRYPTO '09, Lecture Notes in Computer Science, 5677:126-142. Springer-Verlag, 2009.

The definition of differential privacy has recently emerged as a leading standard of privacy guarantees for algorithms on statistical databases. We offer several relaxations of the definition which require privacy guarantees to hold only against efficient—i.e., computationally-bounded—adversaries. We establish various relationships among these notions, and in doing so, we observe their close connection with the theory of pseudodense sets by Reingold et al. [1]. We extend the dense model theorem of Reingold et al. to demonstrate equivalence between two definitions (indistinguishability- and simulatability-based) of computational differential privacy.

Our computational analogues of differential privacy seem to allow for more accurate constructions than the standard information-theoretic analogues. In particular, in the context of private approximation of the distance between two vectors, we present a differentially-private protocol for computing the approximation, and contrast it with a substantially more accurate protocol that is only computationally differentially private.