# The Assumptions for Cryptography (NSF CNS-0831289)

2020
Haitner, Iftach, Thomas Holenstein, Omer Reingold, Salil Vadhan, and Hoeteck Wee. “Inaccessible entropy II: IE functions and universal one-way hashing.” Theory of Computing 16, no. 8 (2020): 1-55. Publisher's VersionAbstract

Version History: published earlier in Henri Gilbert, ed., Advances in Cryptology—EUROCRYPT ‘10, Lecture Notes on Computer Science, as "Universal one-way hash functions via inaccessible entropy":

This paper revisits the construction of Universal One-Way Hash Functions (UOWHFs) from any one-way function due to Rompel (STOC 1990). We give a simpler construction of UOWHFs, which also obtains better efficiency and security. The construction exploits a strong connection to the recently introduced notion of inaccessible entropy (Haitner et al. STOC 2009). With this perspective, we observe that a small tweak of any one-way function $$f$$ is already a weak form of a UOWHF: Consider $$F(x', i)$$ that outputs the $$i$$-bit long prefix of $$f(x)$$. If $$F$$ were a UOWHF then given a random $$x$$ and $$i$$ it would be hard to come up with $$x' \neq x$$ such that $$F(x, i) = F(x', i)$$. While this may not be the case, we show (rather easily) that it is hard to sample $$x'$$ with almost full entropy among all the possible such values of $$x'$$. The rest of our construction simply amplifies and exploits this basic property.

With this and other recent works, we have that the constructions of three fundamental cryptographic primitives (Pseudorandom Generators, Statistically Hiding Commitments and UOWHFs) out of one-way functions are to a large extent unified. In particular, all three constructions rely on and manipulate computational notions of entropy in similar ways. Pseudorandom Generators rely on the well-established notion of pseudoentropy, whereas Statistically Hiding Commitments and UOWHFs rely on the newer notion of inaccessible entropy.

Ullman, Jon, and Salil Vadhan. “PCPs and the hardness of generating synthetic data.” Journal of Cryptology 33 (2020): 2078-2112. Publisher's VersionAbstract

Version HistoryFull version posted as ECCC TR10-017.

Published earlier in Yuval Ishai, ed., Proceedings of the 8th IACR Theory of Cryptography Conference (TCC ‘11), Lecture Notes on Computer Science. Springer-Verlag, Publishers: Vol. 5978, pp. 572-587. https://link.springer.com/chapter/10.1007/978-3-642-19571-6_24

Invited to J. Cryptology selected papers from TCC 2011.

Assuming the existence of one-way functions, we show that there is no polynomial-time, differentially private algorithm $$\mathcal{A}$$ that takes a database $$D ∈ ({0, 1}^d)^n$$ and outputs a “synthetic database” $$\hat{D}$$ all of whose two-way marginals are approximately equal to those of $$D$$. (A two-way marginal is the fraction of database rows $$x ∈ {0, 1}^d$$ with a given pair of values in a given pair of columns.) This answers a question of Barak et al. (PODS ‘07), who gave an algorithm running in time $$\mathrm{poly} (n, 2^d)$$.

Our proof combines a construction of hard-to-sanitize databases based on digital signatures (by Dwork et al., STOC ‘09) with encodings based on probabilistically checkable proofs.

We also present both negative and positive results for generating “relaxed” synthetic data, where the fraction of rows in $$D$$ satisfying a predicate $$c$$ are estimated by applying $$c$$ to each row of $$\hat{D}$$ and aggregating the results in some way.

2013
Haitner, Iftach, Omer Reingold, and Salil Vadhan. “Efficiency improvements in constructing pseudorandom generators from one-way functions.” SIAM Journal on Computing 42, no. 3 (2013): 1405-1430. Publisher's VersionAbstract

Version HistorySpecial Issue on STOC ‘10.

We give a new construction of pseudorandom generators from any one-way function. The construction achieves better parameters and is simpler than that given in the seminal work of Håstad, Impagliazzo, Levin, and Luby [SICOMP ’99]. The key to our construction is a new notion of next-block pseudoentropy, which is inspired by the notion of “in-accessible entropy” recently introduced in [Haitner, Reingold, Vadhan, and Wee, STOC ’09]. An additional advan- tage over previous constructions is that our pseudorandom generators are parallelizable and invoke the one-way function in a non-adaptive manner. Using [Applebaum, Ishai, and Kushilevitz, SICOMP ’06], this implies the existence of pseudorandom generators in NC$$^0$$ based on the existence of one-way functions in NC$$^1$$.

Reshef, Yakir, and Salil Vadhan. “On extractors and exposure-resilient functions for sublogarithmic entropy.” Random Structures & Algorithms 42, no. 3 (2013): 386-401. ArXiv VersionAbstract

Version HistoryPreliminary version posted as arXiv:1003.4029 (Dec. 2010).

We study resilient functions and exposure‐resilient functions in the low‐entropy regime. A resilient function (a.k.a. deterministic extractor for oblivious bit‐fixing sources) maps any distribution on n‐bit strings in which bits are uniformly random and the rest are fixed into an output distribution that is close to uniform. With exposure‐resilient functions, all the input bits are random, but we ask that the output be close to uniform conditioned on any subset of n – k input bits. In this paper, we focus on the case that k is sublogarithmic in n.

We simplify and improve an explicit construction of resilient functions for k sublogarithmic in n due to Kamp and Zuckerman (SICOMP 2006), achieving error exponentially small in krather than polynomially small in k. Our main result is that when k is sublogarithmic in n, the short output length of this construction ($$O(\log k)$$ output bits) is optimal for extractors computable by a large class of space‐bounded streaming algorithms.

Next, we show that a random function is a resilient function with high probability if and only if k is superlogarithmic in n, suggesting that our main result may apply more generally. In contrast, we show that a random function is a static (resp. adaptive) exposure‐resilient function with high probability even if k is as small as a constant (resp. log log n). No explicit exposure‐resilient functions achieving these parameters are known.

2012
Thaler, Justin, Jonathan Ullman, and Salil Vadhan. “Faster algorithms for privately releasing marginals.” In Artur Czumaj, Kurt Mehlhorn, Andrew M. Pitts, and Roger Wattenhofer, editors, Proceedings of the 39th International Colloquium on Automata, Languages, and Programming (ICALP ‘12), Lecture Notes on Computer Science, 7391:810-821. Springer-Verlag, 2012. Publisher's VersionAbstract

Version History: Full version posted as arXiv:1205.1758v2.

We study the problem of releasing k-way marginals of a database $$D ∈ ({0,1}^d)^n$$, while preserving differential privacy. The answer to a $$k$$-way marginal query is the fraction of $$D$$’s records $$x \in \{0,1\}^d$$ with a given value in each of a given set of up to $$k$$ columns. Marginal queries enable a rich class of statistical analyses of a dataset, and designing efficient algorithms for privately releasing marginal queries has been identified as an important open problem in private data analysis (cf. Barak et. al., PODS ’07).

We give an algorithm that runs in time $$d^{O(\sqrt{K})}$$ and releases a private summary capable of answering any $$k$$-way marginal query with at most ±.01 error on every query as long as $$n \geq d^{O(\sqrt{K})}$$. To our knowledge, ours is the first algorithm capable of privately releasing marginal queries with non-trivial worst-case accuracy guarantees in time substantially smaller than the number of k-way marginal queries, which is $$d^{\Theta(k)}$$ (for $$k ≪ d$$).

2011
Dvir, Zeev, Dan Gutfreund, Guy Rothblum, and Salil Vadhan. “On approximating the entropy of polynomial mappings.” In Proceedings of the Second Symposium on Innovations in Computer Science (ICS 2011), 460-475. Tsinghua University Press, 2011. Publisher's VersionAbstract

Version HistoryFull version posted as ECCC TR10-60.

We investigate the complexity of the following computational problem:

Polynomial Entropy Approximation (PEA): Given a low-degree polynomial mapping $$p : \mathbb{F}^n → \mathbb{F}^m$$, where F is a finite field, approximate the output entropy $$H(p(U_n))$$, where $$U_n$$ is the uniform distribution on $$\mathbb{F}^n$$ and $$H$$ may be any of several entropy measures.

We show:

• Approximating the Shannon entropy of degree 3 polynomials $$p : \mathbb{F}_2^n \to \mathbb{F}^m_2$$ over $$\mathbb{F}_2$$ to within an additive constant (or even $$n^.9$$) is complete for $$\mathbf{SZKP_L}$$, the class of problems having statistical zero-knowledge proofs where the honest verifier and its simulator are computable in logarithmic space. ($$\mathbf{SZKP_L}$$contains most of the natural problems known to be in the full class $$\mathbf{SZKP}$$.)

• For prime fields $$\mathbb{F} \neq \mathbb{F}_2$$ and homogeneous quadratic polynomials $$p : \mathbb{F}^n \to \mathbb{F}^m$$, there is a probabilistic polynomial-time algorithm that distinguishes the case that $$p(U_n)$$) has entropy smaller than k from the case that $$p(U_n))$$ has min-entropy (or even Renyi entropy) greater than $$(2 + o(1))k$$.

• For degree d polynomials $$p : \mathbb{F}^n_2 \to \mathbb{F}^m_2$$ , there is a polynomial-time algorithm that distinguishes the case that $$p(U_n)$$ has max-entropy smaller than $$k$$ (where the max-entropy of a random variable is the logarithm of its support size) from the case that $$p(U_n)$$ has max-entropy at least $$(1 + o(1)) \cdot k^d$$ (for fixed $$d$$ and large $$k$$).

Mahmoody, Mohammad, Tal Moran, and Salil Vadhan. “Time-lock puzzles in the random oracle model.” In P. Rogaway, editor, Advances in Cryptology—CRYPTO ‘11, Lecture Notes in Computer Science, 6841:39-50. Springer-Verlag, 2011. Publisher's VersionAbstract

A time-lock puzzle is a mechanism for sending messages “to the future”. The sender publishes a puzzle whose solution is the message to be sent, thus hiding it until enough time has elapsed for the puzzle to be solved. For time-lock puzzles to be useful, generating a puzzle should take less time than solving it. Since adversaries may have access to many more computers than honest solvers, massively parallel solvers should not be able to produce a solution much faster than serial ones.

To date, we know of only one mechanism that is believed to satisfy these properties: the one proposed by Rivest, Shamir and Wagner (1996), who originally introduced the notion of time-lock puzzles. Their puzzle is based on the serial nature of exponentiation and the hardness of factoring, and is therefore vulnerable to advances in factoring techniques (as well as to quantum attacks).

In this work, we study the possibility of constructing time-lock puzzles in the random-oracle model. Our main result is negative, ruling out time-lock puzzles that require more parallel time to solve than the total work required to generate a puzzle. In particular, this should rule out black-box constructions of such time-lock puzzles from one-way permutations and collision-resistant hash-functions. On the positive side, we construct a time-lock puzzle with a linear gap in parallel time: a new puzzle can be generated with one round of $${n}$$ parallel queries to the random oracle, but $${n}$$ rounds of serial queries are required to solve it (even for massively parallel adversaries).

2010
Rothblum, Guy, and Salil Vadhan. “Are PCPs inherent in efficient arguments?Computational Complexity 19, no. 2 (2010): 265-304. Publisher's VersionAbstract

Version HistorySpecial Issue on CCC '09.

Starting with Kilian (STOC ‘92), several works have shown how to use probabilistically checkable proofs (PCPs) and cryptographic primitives such as collision-resistant hashing to construct very efficient argument systems (a.k.a. computationally sound proofs), for example with polylogarithmic communication complexity. Ishai et al. (CCC ‘07) raised the question of whether PCPs are inherent in efficient arguments, and if so, to what extent. We give evidence that they are, by showing how to convert any argument system whose soundness is reducible to the security of some cryptographic primitive into a PCP system whose efficiency is related to that of the argument system and the reduction (under certain complexity assumptions).

Birrell, Eleanor, and Salil Vadhan. “Composition of zero-knowledge proofs with efficient provers.” In Daniele Micciancio, editor, Proceedings of the 7th IACR Theory of Cryptography Conference (TCC ‘10), Lecture Notes on Computer Science, 5978:572-587. Springer-Verlag, 2010. Publisher's VersionAbstract

We revisit the composability of different forms of zero- knowledge proofs when the honest prover strategy is restricted to be polynomial time (given an appropriate auxiliary input). Our results are:

1. When restricted to efficient provers, the original Goldwasser–Micali–Rackoff (GMR) definition of zero knowledge (STOC ‘85), here called plain zero knowledge, is closed under a constant number of sequential compositions (on the same input). This contrasts with the case of unbounded provers, where Goldreich and Krawczyk (ICALP ‘90, SICOMP ‘96) exhibited a protocol that is zero knowledge under the GMR definition, but for which the sequential composition of 2 copies is not zero knowledge.

2. If we relax the GMR definition to only require that the simulation is indistinguishable from the verifier’s view by uniform polynomial-time distinguishers, with no auxiliary input beyond the statement being proven, then again zero knowledge is not closed under sequential composition of 2 copies.

3. We show that auxiliary-input zero knowledge with efficient provers is not closed under parallel composition of 2 copies under the assumption that there is a secure key agreement protocol (in which it is easy to recognize valid transcripts). Feige and Shamir (STOC ‘90) gave similar results under the seemingly incomparable assumptions that (a) the discrete logarithm problem is hard, or (b) $$\mathcal{UP}\nsubseteq\mathcal{BPP}$$ and one-way functions exist.
Dwork, Cynthia, Guy Rothblum, and Salil Vadhan. “Boosting and differential privacy.” In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS ‘10), 51-60. IEEE, 2010. Publisher's VersionAbstract
Boosting is a general method for improving the accuracy of learning algorithms. We use boosting to construct improved privacy-preserving synopses of an input database. These are data structures that yield, for a given set $$\mathcal{Q}$$ of queries over an input database, reasonably accurate estimates of the responses to every query in $$\mathcal{Q}$$, even when the number of queries is much larger than the number of rows in the database. Given a base synopsis generator that takes a distribution on $$\mathcal{Q}$$ and produces a "weak" synopsis that yields "good" answers for a majority of the weight in $$\mathcal{Q}$$, our Boosting for Queries algorithm obtains a synopsis that is good for all of $$\mathcal{Q}$$. We ensure privacy for the rows of the database, but the boosting is performed on the queries. We also provide the first synopsis generators for arbitrary sets of arbitrary low-sensitivity queries, i.e., queries whose answers do not vary much under the addition or deletion of a single row. In the execution of our algorithm certain tasks, each incurring some privacy loss, are performed many times. To analyze the cumulative privacy loss, we obtain an $$O(\varepsilon^2)$$ bound on the expected privacy loss from a single $$\varepsilon$$-differentially private mechanism. Combining this with evolution of confidence arguments from the literature, we get stronger bounds on the expected cumulative privacy loss due to multiple mechanisms, each of which provides $$\varepsilon$$-differential privacy or one of its relaxations, and each of which operates on (potentially) different, adaptively chosen, databases.
Chung, Kai-Min, Yael Kalai, and Salil Vadhan. “Improved delegation of computation using fully homomorphic encryption.” In T. Rabin, editor, Advances in Cryptology—CRYPTO ‘10, Lecture Notes in Computer Science, 6223:483-501. Springer-Verlag, 2010. Publisher's VersionAbstract

Version HistoryFull version posted as Cryptology ePrint Archive Report 210/241.

Following Gennaro, Gentry, and Parno (Cryptology ePrint Archive 2009/547), we use fully homomorphic encryption to design improved schemes for delegating computation. In such schemes, a delegator outsources the computation of a function $${F}$$ on many, dynamically chosen inputs $$x_i$$ to a worker in such a way that it is infeasible for the worker to make the delegator accept a result other than $${F}(x_i)$$. The “online stage” of the Gennaro et al. scheme is very efficient: the parties exchange two messages, the delegator runs in time poly$$(\log{T})$$, and the worker runs in time poly$$(T)$$, where $$T$$ is the time complexity of $$F$$. However, the “offline stage” (which depends on the function $$F$$ but not the inputs to be delegated) is inefficient: the delegator runs in time poly$$(T)$$ and generates a public key of length poly$$(T)$$ that needs to be accessed by the worker during the online stage.

Our first construction eliminates the large public key from the Gennaro et al. scheme. The delegator still invests poly$$(T)$$ time in the offline stage, but does not need to communicate or publish anything. Our second construction reduces the work of the delegator in the offline stage to poly$$(\log{T})$$ at the price of a 4-message (offline) interaction with a poly$$(T)$$-time worker (which need not be the same as the workers used in the online stage). Finally, we describe a “pipelined” implementation of the second construction that avoids the need to re-run the offline construction after errors are detected (assuming errors are not too frequent).

McGregor, Andrew, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, and Salil Vadhan. “The limits of two-party differential privacy.” In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS ‘10), 81-90. IEEE, 2010. Publisher's VersionAbstract

Version History and Errata: Subsequent version published in ECCC 2011. Proposition 8 and Part (b) of Theorem 13 in the FOCS version are incorrect, and are removed from the ECCC version.

We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). In addition, those bounds demonstrate a dramatic gap between the accuracy that can be obtained by differentially private data analysis versus the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially private protocol is strongly related to the ability to approximate it by a low communication protocol. (The connection goes in both directions).

2009
Ong, Shien Jin, David Parkes, Alon Rosen, and Salil Vadhan. “Fairness with an honest minority and a rational majority.” In O. Reingold, editor, Proceedings of the Fourth Theory of Cryptography Conference (TCC ‘09), Lecture Notes in Computer Science, 5444:36-53. Springer-Verlag, 2009. Publisher's VersionAbstract

Version HistoryPreliminary version posted as Cryptology ePrint Archive Report 2008/097, March 2008.

We provide a simple protocol for secret reconstruction in any threshold secret sharing scheme, and prove that it is fair when executed with many rational parties together with a small minority of honest parties. That is, all parties will learn the secret with high probability when the honest parties follow the protocol and the rational parties act in their own self-interest (as captured by a set-Nash analogue of trembling hand perfect equilibrium). The protocol only requires a standard (synchronous) broadcast channel, tolerates both early stopping and incorrectly computed messages, and only requires 2 rounds of communication.

Previous protocols for this problem in the cryptographic or economic models have either required an honest majority, used strong communication channels that enable simultaneous exchange of information, or settled for approximate notions of security/equilibria. They all also required a nonconstant number of rounds of communication.

Dodis, Yevgeniy, Salil Vadhan, and Daniel Wichs. “Proofs of retrievability via hardness amplification.” In O. Reingold, editor, Proceedings of the Fourth Theory of Cryptography Conference (TCC ‘09), Lecture Notes in Computer Science, 5444:109-127. Springer-Verlag, 2009. Publisher's VersionAbstract

Version History: Originally presented at Theory of Cryptography Conference (TCC) 2009. Full version published in Cryptology ePrint Archive (attached as ePrint2009).

Proofs of Retrievability (PoR), introduced by Juels and Kaliski [JK07], allow the client to store a file F on an untrusted server, and later run an efficient audit protocol in which the server proves that it (still) possesses the client’s data. Constructions of PoR schemes attempt to minimize the client and server storage, the communication complexity of an audit, and even the number of file-blocks accessed by the server during the audit. In this work, we identify several different variants of the problem (such as bounded-use vs. unbounded-use, knowledge-soundness vs. information-soundness), and giving nearly optimal PoR schemes for each of these variants. Our constructions either improve (and generalize) the prior PoR constructions, or give the first known PoR schemes with the required properties. In particular, we

• Formally prove the security of an (optimized) variant of the bounded-use scheme of Juels and Kaliski [JK07], without making any simplifying assumptions on the behavior of the adversary.
• Build the first unbounded-use PoR scheme where the communication complexity is linear in the security parameter and which does not rely on Random Oracles, resolving an open question of Shacham and Waters [SW08].
• Build the first bounded-use scheme with information-theoretic security.

The main insight of our work comes from a simple connection between PoR schemes and the notion of hardness amplification, extensively studied in complexity theory. In particular, our im- provements come from first abstracting a purely information-theoretic notion of PoR codes, and then building nearly optimal PoR codes using state-of-the-art tools from coding and complexity theory.

Dwork, Cynthia, Moni Naor, Omer Reingold, Guy Rothblum, and Salil Vadhan. “On the complexity of differentially private data release: Efficient algorithms and hardness results.” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC ‘09), 381-390. ACM, 2009. Publisher's VersionAbstract
We consider private data analysis in the setting in which a trusted and trustworthy curator, having obtained a large data set containing private information, releases to the public a "sanitization" of the data set that simultaneously protects the privacy of the individual contributors of data and offers utility to the data analyst. The sanitization may be in the form of an arbitrary data structure, accompanied by a computational procedure for determining approximate answers to queries on the original data set, or it may be a "synthetic data set" consisting of data items drawn from the same universe as items in the original data set; queries are carried out as if the synthetic data set were the actual input. In either case the process is non-interactive; once the sanitization has been released the original data and the curator play no further role. For the task of sanitizing with a synthetic dataset output, we map the boundary between computational feasibility and infeasibility with respect to a variety of utility measures. For the (potentially easier) task of sanitizing with unrestricted output format, we show a tight qualitative and quantitative connection between hardness of sanitizing and the existence of traitor tracing schemes.
Haitner, Iftach, Omer Reingold, Salil Vadhan, and Hoeteck Wee. “Inaccessible entropy.” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC ‘09), 611-620. ACM, 2009. Publisher's VersionAbstract

We put forth a new computational notion of entropy, which measures the (in)feasibility of sampling high entropy strings that are consistent with a given protocol. Specifically, we say that the $$i$$’th round of a protocol $$(\mathsf{A,B})$$ has accessible entropy at most $$k$$, if no polynomial-time strategy $$\mathsf{A}^*$$ can generate messages for $$\mathsf{A}$$ such that the entropy of its message in the $$i$$’th round has entropy greater than $$k$$ when conditioned both on prior messages of the protocol and on prior coin tosses of $$\mathsf{A}^*$$. We say that the protocol has inaccessible entropy if the total accessible entropy (summed over the rounds) is noticeably smaller than the real entropy of $$\mathsf{A}$$’s messages, conditioned only on prior messages (but not the coin tosses of $$\mathsf{A}$$). As applications of this notion, we

• Give a much simpler and more efficient construction of statistically hiding commitment schemes from arbitrary one- way functions.

• Prove that constant-round statistically hiding commitments are necessary for constructing constant-round zero-knowledge proof systems for NP that remain secure under parallel composition (assuming the existence of one-way functions).