# Computational Complexity

Chung, Kai-Min, Omer Reingold, and Salil Vadhan. “S-T connectivity on digraphs with a known stationary distribution.” In ACM Transactions on Algorithms. Vol. 7. 3rd ed. ACM, 2011. Publisher's VersionAbstract

Version history: Preliminary versions in CCC '07 and on ECCC (TR07-030).

We present a deterministic logspace algorithm for solving S-T Connectivity on directed graphs if: (i) we are given a stationary distribution of the random walk on the graph in which both of the input vertices $$s$$ and $$t$$ have nonnegligible probability mass and (ii) the random walk which starts at the source vertex $$s$$ has polynomial mixing time. This result generalizes the recent deterministic logspace algorithm for S-T Connectivity on undirected graphs [Reingold, 2008]. It identifies knowledge of the stationary distribution as the gap between the S-T Connectivity problems we know how to solve in logspace (L) and those that capture all of randomized logspace (RL).

Dvir, Zeev, Dan Gutfreund, Guy Rothblum, and Salil Vadhan. “On approximating the entropy of polynomial mappings.” In Proceedings of the Second Symposium on Innovations in Computer Science (ICS 2011), 460-475. Tsinghua University Press, 2011. Publisher's VersionAbstract

Version HistoryFull version posted as ECCC TR10-60.

We investigate the complexity of the following computational problem:

Polynomial Entropy Approximation (PEA): Given a low-degree polynomial mapping $$p : \mathbb{F}^n → \mathbb{F}^m$$, where F is a finite field, approximate the output entropy $$H(p(U_n))$$, where $$U_n$$ is the uniform distribution on $$\mathbb{F}^n$$ and $$H$$ may be any of several entropy measures.

We show:

• Approximating the Shannon entropy of degree 3 polynomials $$p : \mathbb{F}_2^n \to \mathbb{F}^m_2$$ over $$\mathbb{F}_2$$ to within an additive constant (or even $$n^.9$$) is complete for $$\mathbf{SZKP_L}$$, the class of problems having statistical zero-knowledge proofs where the honest verifier and its simulator are computable in logarithmic space. ($$\mathbf{SZKP_L}$$contains most of the natural problems known to be in the full class $$\mathbf{SZKP}$$.)

• For prime fields $$\mathbb{F} \neq \mathbb{F}_2$$ and homogeneous quadratic polynomials $$p : \mathbb{F}^n \to \mathbb{F}^m$$, there is a probabilistic polynomial-time algorithm that distinguishes the case that $$p(U_n)$$) has entropy smaller than k from the case that $$p(U_n))$$ has min-entropy (or even Renyi entropy) greater than $$(2 + o(1))k$$.

• For degree d polynomials $$p : \mathbb{F}^n_2 \to \mathbb{F}^m_2$$ , there is a polynomial-time algorithm that distinguishes the case that $$p(U_n)$$ has max-entropy smaller than $$k$$ (where the max-entropy of a random variable is the logarithm of its support size) from the case that $$p(U_n)$$ has max-entropy at least $$(1 + o(1)) \cdot k^d$$ (for fixed $$d$$ and large $$k$$).

Mahmoody, Mohammad, Tal Moran, and Salil Vadhan. “Time-lock puzzles in the random oracle model.” In P. Rogaway, editor, Advances in Cryptology—CRYPTO ‘11, Lecture Notes in Computer Science, 6841:39-50. Springer-Verlag, 2011. Publisher's VersionAbstract

A time-lock puzzle is a mechanism for sending messages “to the future”. The sender publishes a puzzle whose solution is the message to be sent, thus hiding it until enough time has elapsed for the puzzle to be solved. For time-lock puzzles to be useful, generating a puzzle should take less time than solving it. Since adversaries may have access to many more computers than honest solvers, massively parallel solvers should not be able to produce a solution much faster than serial ones.

To date, we know of only one mechanism that is believed to satisfy these properties: the one proposed by Rivest, Shamir and Wagner (1996), who originally introduced the notion of time-lock puzzles. Their puzzle is based on the serial nature of exponentiation and the hardness of factoring, and is therefore vulnerable to advances in factoring techniques (as well as to quantum attacks).

In this work, we study the possibility of constructing time-lock puzzles in the random-oracle model. Our main result is negative, ruling out time-lock puzzles that require more parallel time to solve than the total work required to generate a puzzle. In particular, this should rule out black-box constructions of such time-lock puzzles from one-way permutations and collision-resistant hash-functions. On the positive side, we construct a time-lock puzzle with a linear gap in parallel time: a new puzzle can be generated with one round of $${n}$$ parallel queries to the random oracle, but $${n}$$ rounds of serial queries are required to solve it (even for massively parallel adversaries).

Rothblum, Guy, and Salil Vadhan. “Are PCPs inherent in efficient arguments?Computational Complexity 19, no. 2 (2010): 265-304. Publisher's VersionAbstract

Version HistorySpecial Issue on CCC '09.

Starting with Kilian (STOC ‘92), several works have shown how to use probabilistically checkable proofs (PCPs) and cryptographic primitives such as collision-resistant hashing to construct very efficient argument systems (a.k.a. computationally sound proofs), for example with polylogarithmic communication complexity. Ishai et al. (CCC ‘07) raised the question of whether PCPs are inherent in efficient arguments, and if so, to what extent. We give evidence that they are, by showing how to convert any argument system whose soundness is reducible to the security of some cryptographic primitive into a PCP system whose efficiency is related to that of the argument system and the reduction (under certain complexity assumptions).

Birrell, Eleanor, and Salil Vadhan. “Composition of zero-knowledge proofs with efficient provers.” In Daniele Micciancio, editor, Proceedings of the 7th IACR Theory of Cryptography Conference (TCC ‘10), Lecture Notes on Computer Science, 5978:572-587. Springer-Verlag, 2010. Publisher's VersionAbstract

We revisit the composability of different forms of zero- knowledge proofs when the honest prover strategy is restricted to be polynomial time (given an appropriate auxiliary input). Our results are:

1. When restricted to efficient provers, the original Goldwasser–Micali–Rackoff (GMR) definition of zero knowledge (STOC ‘85), here called plain zero knowledge, is closed under a constant number of sequential compositions (on the same input). This contrasts with the case of unbounded provers, where Goldreich and Krawczyk (ICALP ‘90, SICOMP ‘96) exhibited a protocol that is zero knowledge under the GMR definition, but for which the sequential composition of 2 copies is not zero knowledge.

2. If we relax the GMR definition to only require that the simulation is indistinguishable from the verifier’s view by uniform polynomial-time distinguishers, with no auxiliary input beyond the statement being proven, then again zero knowledge is not closed under sequential composition of 2 copies.

3. We show that auxiliary-input zero knowledge with efficient provers is not closed under parallel composition of 2 copies under the assumption that there is a secure key agreement protocol (in which it is easy to recognize valid transcripts). Feige and Shamir (STOC ‘90) gave similar results under the seemingly incomparable assumptions that (a) the discrete logarithm problem is hard, or (b) $$\mathcal{UP}\nsubseteq\mathcal{BPP}$$ and one-way functions exist.
Chung, Kai-Min, Yael Kalai, and Salil Vadhan. “Improved delegation of computation using fully homomorphic encryption.” In T. Rabin, editor, Advances in Cryptology—CRYPTO ‘10, Lecture Notes in Computer Science, 6223:483-501. Springer-Verlag, 2010. Publisher's VersionAbstract

Version HistoryFull version posted as Cryptology ePrint Archive Report 210/241.

Following Gennaro, Gentry, and Parno (Cryptology ePrint Archive 2009/547), we use fully homomorphic encryption to design improved schemes for delegating computation. In such schemes, a delegator outsources the computation of a function $${F}$$ on many, dynamically chosen inputs $$x_i$$ to a worker in such a way that it is infeasible for the worker to make the delegator accept a result other than $${F}(x_i)$$. The “online stage” of the Gennaro et al. scheme is very efficient: the parties exchange two messages, the delegator runs in time poly$$(\log{T})$$, and the worker runs in time poly$$(T)$$, where $$T$$ is the time complexity of $$F$$. However, the “offline stage” (which depends on the function $$F$$ but not the inputs to be delegated) is inefficient: the delegator runs in time poly$$(T)$$ and generates a public key of length poly$$(T)$$ that needs to be accessed by the worker during the online stage.

Our first construction eliminates the large public key from the Gennaro et al. scheme. The delegator still invests poly$$(T)$$ time in the offline stage, but does not need to communicate or publish anything. Our second construction reduces the work of the delegator in the offline stage to poly$$(\log{T})$$ at the price of a 4-message (offline) interaction with a poly$$(T)$$-time worker (which need not be the same as the workers used in the online stage). Finally, we describe a “pipelined” implementation of the second construction that avoids the need to re-run the offline construction after errors are detected (assuming errors are not too frequent).

McGregor, Andrew, Ilya Mironov, Toniann Pitassi, Omer Reingold, Kunal Talwar, and Salil Vadhan. “The limits of two-party differential privacy.” In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS ‘10), 81-90. IEEE, 2010. Publisher's VersionAbstract

Version History and Errata: Subsequent version published in ECCC 2011. Proposition 8 and Part (b) of Theorem 13 in the FOCS version are incorrect, and are removed from the ECCC version.

We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). In addition, those bounds demonstrate a dramatic gap between the accuracy that can be obtained by differentially private data analysis versus the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially private protocol is strongly related to the ability to approximate it by a low communication protocol. (The connection goes in both directions).

Guruswami, Venkatesan, Christopher Umans, and Salil Vadhan. “Unbalanced expanders and randomness extractors from Parvaresh–Vardy codes.” Journal of the ACM 56, no. 4 (2009): 1–34. Publisher's VersionAbstract

Version History: Preliminary versions of this article appeared as Technical Report TR06-134 in Electronic Colloquium on Computational Complexity, 2006, and in Proceedings of the 22nd Annual IEEE Conference on Computional Complexity (CCC '07), pp. 96–108. Preliminary version recipient of Best Paper Award at CCC '07.

We give an improved explicit construction of highly unbalanced bipartite expander graphs with expansion arbitrarily close to the degree (which is polylogarithmic in the number of vertices). Both the degree and the number of right-hand vertices are polynomially close to optimal, whereas the previous constructions of Ta-Shma et al. [2007] required at least one of these to be quasipolynomial in the optimal. Our expanders have a short and self-contained description and analysis, based on the ideas underlying the recent list-decodable error-correcting codes of Parvaresh and Vardy [2005].

Our expanders can be interpreted as near-optimal “randomness condensers,” that reduce the task of extracting randomness from sources of arbitrary min-entropy rate to extracting randomness from sources of min-entropy rate arbitrarily close to 1, which is a much easier task. Using this connection, we obtain a new, self-contained construction of randomness extractors that is optimal up to constant factors, while being much simpler than the previous construction of Lu et al. [2003] and improving upon it when the error parameter is small (e.g., 1/poly(n)).

Haitner, Iftach, Minh Nguyen, Shien Jin Ong, Omer Reingold, and Salil Vadhan. “Statistically hiding commitments and statistical zero-knowledge arguments from any one-way function.” SIAM Journal on Computing 39, no. 3 (2009): 1153-1218. Publisher's VersionAbstract

Version HistorySpecial Issue on STOC ‘07. Merge of papers from FOCS ‘06 and STOC ‘07. Received SIAM Outstanding Paper Prize 2011.

We give a construction of statistically hiding commitment schemes (those in which the hiding property holds against even computationally unbounded adversaries) under the minimal complexity assumption that one-way functions exist. Consequently, one-way functions suffice to give statistical zero-knowledge arguments for any NP statement (whereby even a computationally unbounded adversarial verifier learns nothing other than the fact that the assertion being proven is true, and no polynomial-time adversarial prover can convince the verifier of a false statement). These results resolve an open question posed by Naor et al. [J. Cryptology, 11 (1998), pp. 87–108].

Dodis, Yevgeniy, Salil Vadhan, and Daniel Wichs. “Proofs of retrievability via hardness amplification.” In O. Reingold, editor, Proceedings of the Fourth Theory of Cryptography Conference (TCC ‘09), Lecture Notes in Computer Science, 5444:109-127. Springer-Verlag, 2009. Publisher's VersionAbstract

Version History: Originally presented at Theory of Cryptography Conference (TCC) 2009. Full version published in Cryptology ePrint Archive (attached as ePrint2009).

Proofs of Retrievability (PoR), introduced by Juels and Kaliski [JK07], allow the client to store a file F on an untrusted server, and later run an efficient audit protocol in which the server proves that it (still) possesses the client’s data. Constructions of PoR schemes attempt to minimize the client and server storage, the communication complexity of an audit, and even the number of file-blocks accessed by the server during the audit. In this work, we identify several different variants of the problem (such as bounded-use vs. unbounded-use, knowledge-soundness vs. information-soundness), and giving nearly optimal PoR schemes for each of these variants. Our constructions either improve (and generalize) the prior PoR constructions, or give the first known PoR schemes with the required properties. In particular, we

• Formally prove the security of an (optimized) variant of the bounded-use scheme of Juels and Kaliski [JK07], without making any simplifying assumptions on the behavior of the adversary.
• Build the first unbounded-use PoR scheme where the communication complexity is linear in the security parameter and which does not rely on Random Oracles, resolving an open question of Shacham and Waters [SW08].
• Build the first bounded-use scheme with information-theoretic security.

The main insight of our work comes from a simple connection between PoR schemes and the notion of hardness amplification, extensively studied in complexity theory. In particular, our im- provements come from first abstracting a purely information-theoretic notion of PoR codes, and then building nearly optimal PoR codes using state-of-the-art tools from coding and complexity theory.

Dwork, Cynthia, Moni Naor, Omer Reingold, Guy Rothblum, and Salil Vadhan. “On the complexity of differentially private data release: Efficient algorithms and hardness results.” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC ‘09), 381-390. ACM, 2009. Publisher's VersionAbstract
We consider private data analysis in the setting in which a trusted and trustworthy curator, having obtained a large data set containing private information, releases to the public a "sanitization" of the data set that simultaneously protects the privacy of the individual contributors of data and offers utility to the data analyst. The sanitization may be in the form of an arbitrary data structure, accompanied by a computational procedure for determining approximate answers to queries on the original data set, or it may be a "synthetic data set" consisting of data items drawn from the same universe as items in the original data set; queries are carried out as if the synthetic data set were the actual input. In either case the process is non-interactive; once the sanitization has been released the original data and the curator play no further role. For the task of sanitizing with a synthetic dataset output, we map the boundary between computational feasibility and infeasibility with respect to a variety of utility measures. For the (potentially easier) task of sanitizing with unrestricted output format, we show a tight qualitative and quantitative connection between hardness of sanitizing and the existence of traitor tracing schemes.
Haitner, Iftach, Omer Reingold, Salil Vadhan, and Hoeteck Wee. “Inaccessible entropy.” In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC ‘09), 611-620. ACM, 2009. Publisher's VersionAbstract

We put forth a new computational notion of entropy, which measures the (in)feasibility of sampling high entropy strings that are consistent with a given protocol. Specifically, we say that the $$i$$’th round of a protocol $$(\mathsf{A,B})$$ has accessible entropy at most $$k$$, if no polynomial-time strategy $$\mathsf{A}^*$$ can generate messages for $$\mathsf{A}$$ such that the entropy of its message in the $$i$$’th round has entropy greater than $$k$$ when conditioned both on prior messages of the protocol and on prior coin tosses of $$\mathsf{A}^*$$. We say that the protocol has inaccessible entropy if the total accessible entropy (summed over the rounds) is noticeably smaller than the real entropy of $$\mathsf{A}$$’s messages, conditioned only on prior messages (but not the coin tosses of $$\mathsf{A}$$). As applications of this notion, we

• Give a much simpler and more efficient construction of statistically hiding commitment schemes from arbitrary one- way functions.

• Prove that constant-round statistically hiding commitments are necessary for constructing constant-round zero-knowledge proof systems for NP that remain secure under parallel composition (assuming the existence of one-way functions).

Trevisan, Luca, Madhur Tulsiani, and Salil Vadhan. “Regularity, boosting, and efficiently simulating every high-entropy distribution.” In Proceedings of the 24th Annual IEEE Conference on Computational Complexity (CCC ‘09), 126-136. IEEE, 2009. Publisher's VersionAbstract

Version HistoryPreliminary version posted as ECCC TR08-103.

We show that every bounded function $${g}$$ : $$\{0,1\}^n \to [0,1]$$ admits an efficiently computable "simulator" function $${h}$$ : $$\{0,1\}^n \to [0,1]$$ such that every fixed polynomial size circuit has approximately the same correlation with $${g}$$ as with $${h}$$. If g describes (up to scaling) a high min-entropy distribution $$D$$, then $${h}$$ can be used to efficiently sample a distribution $$D'$$ of the same min-entropy that is indistinguishable from $$D$$ by circuits of fixed polynomial size. We state and prove our result in a more abstract setting, in which we allow arbitrary finite domains instead of $$\{0,1\}^n$$, and arbitrary families of distinguishers, instead of fixed polynomial size circuits. Our result implies (a) the weak Szemeredi regularity Lemma of Frieze and Kannan (b) a constructive version of the dense model theorem of Green, Tao and Ziegler with better quantitative parameters (polynomial rather than exponential in the distinguishing probability), and (c) the Impagliazzo hardcore set Lemma. It appears to be the general result underlying the known connections between "regularity" results in graph theory, "decomposition" results in additive combinatorics, and the hardcore Lemma in complexity theory. We present two proofs of our result, one in the spirit of Nisan's proof of the hardcore Lemma via duality of linear programming, and one similar to Impagliazzo's "boosting" proof. A third proof by iterative partitioning, which gives the complexity of the sampler to be exponential in the distinguishing probability, is also implicit in the Green-Tao-Ziegler proofs of the dense model theorem.

Mironov, Ilya, Omkant Pandey, Omer Reingold, and Salil Vadhan. “Computational differential privacy.” In S. Halevi, editor, Advances in Cryptology—CRYPTO ‘09, Lecture Notes in Computer Science, 5677:126-142. Springer-Verlag, 2009. Publisher's VersionAbstract

The definition of differential privacy has recently emerged as a leading standard of privacy guarantees for algorithms on statistical databases. We offer several relaxations of the definition which require privacy guarantees to hold only against efficient—i.e., computationally-bounded—adversaries. We establish various relationships among these notions, and in doing so, we observe their close connection with the theory of pseudodense sets by Reingold et al.[1]. We extend the dense model theorem of Reingold et al. to demonstrate equivalence between two definitions (indistinguishability-and simulatability-based) of computational differential privacy.

Our computational analogues of differential privacy seem to allow for more accurate constructions than the standard information-theoretic analogues. In particular, in the context of private approximation of the distance between two vectors, we present a differentially-private protocol for computing the approximation, and contrast it with a substantially more accurate protocol that is only computationally differentially private.

Lovett, Shachar, Omer Reingold, Luca Trevisan, and Salil Vadhan. “Pseudorandom bit generators that fool modular sums.” In Proceedings of the 13th International Workshop on Randomization and Computation (RANDOM ‘09), Lecture Notes in Computer Science, 5687:615-630. Springer-Verlag, 2009. Publisher's VersionAbstract

We consider the following problem: for given $$n, M,$$ produce a sequence $$X_1, X_2, . . . , X_n$$ of bits that fools every linear test modulo $$M$$. We present two constructions of generators for such sequences. For every constant prime power $$M$$, the first construction has seed length $$O_M(\log(n/\epsilon))$$, which is optimal up to the hidden constant. (A similar construction was independently discovered by Meka and Zuckerman [MZ]). The second construction works for every $$M, n,$$ and has seed length $$O(\log n + \log(M/\epsilon) \log( M \log(1/\epsilon)))$$.

The problem we study is a generalization of the problem of constructing small bias distributions [NN], which are solutions to the $$M=2$$ case. We note that even for the case $$M=3$$ the best previously known con- structions were generators fooling general bounded-space computations, and required $$O(\log^2 n)$$ seed length.

For our first construction, we show how to employ recently constructed generators for sequences of elements of $$\mathbb{Z}_M$$ that fool small-degree polynomials modulo $$M$$. The most interesting technical component of our second construction is a variant of the derandomized graph squaring operation of [RV]. Our generalization handles a product of two distinct graphs with distinct bounds on their expansion. This is then used to produce pseudorandom walks where each step is taken on a different regular directed graph (rather than pseudorandom walks on a single regular directed graph as in [RTV, RV]).