# Pseudorandomness

Doron, Dean, Jack Murtagh, Salil Vadhan, and David Zuckerman. Spectral sparsification via bounded-independence sampling. Electronic Colloquium on Computational Complexity (ECCC), TR20-026, 2020. Publisher's VersionAbstract

Version History:

v1, 26 Feb 2020: https://arxiv.org/abs/2002.11237

We give a deterministic, nearly logarithmic-space algorithm for mild spectral sparsification of undirected graphs. Given a weighted, undirected graph $$G$$ on $$n$$ vertices described by a binary string of length $$N$$, an integer $$k \leq \log n$$ and an error parameter $$\varepsilon > 0$$, our algorithm runs in space $$\tilde{O}(k \log(N ^. w_{max}/w_{min}))$$ where $$w_{max}$$ and $$w_{min}$$ are the maximum and minimum edge weights in $$G$$, and produces a weighted graph $$H$$ with $$\tilde{O}(n^{1+2/k} / \varepsilon^2)$$expected edges that spectrally approximates $$G$$, in the sense of Spielmen and Teng [ST04], up to an error of $$\varepsilon$$.

Our algorithm is based on a new bounded-independence analysis of Spielman and Srivastava's effective resistance based edge sampling algorithm [SS08] and uses results from recent work on space-bounded Laplacian solvers [MRSV17]. In particular, we demonstrate an inherent tradeoff (via upper and lower bounds) between the amount of (bounded) independence used in the edge sampling algorithm, denoted by $$k$$ above, and the resulting sparsity that can be achieved.

Vadhan, Salil. “Computational entropy.” In Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali (Oded Goldreich, Ed.), 693-726. ACM, 2019. Publisher's VersionAbstract

In this survey, we present several computational analogues of entropy and illustrate how they are useful for constructing cryptographic primitives. Specifically, we focus on constructing pseudorandom generators and statistically hiding commitments from arbitrary one-way functions, and demonstrate that:

1. The security properties of these (and other) cryptographic primitives can be understood in terms of various computational analogues of entropy, and in particular how these computational measures of entropy can be very different from real, information-theoretic entropy.
2. It can be shown that every one-way function directly exhibits some gaps between real entropy and the various computational entropies.
3. Thus we can construct the desired cryptographic primitives by amplifying and manipulating the entropy gaps in a one-way function, through forms of repetition and hashing.

The constructions we present (which are from the past decade) are much simpler and more efficient than the original ones, and are based entirely on natural manipulations of new notions of computational entropy. The two constructions are "dual" to each other, whereby the construction of pseudorandom generators relies on a form of computational entropy ("pseudoentropy") being larger than the real entropy, while the construction of statistically hiding commitments relies on a form of computational entropy ("accessible entropy") being smaller than the real entropy. Beyond that difference, the two constructions share a common structure, using a very similar sequence of manipulations of real and computational entropy. As a warmup, we also "deconstruct" the classic construction of pseudorandom generators from one-way permutations using the modern language of computational entropy.

This survey is written in honor of Shafi Goldwasser and Silvio Micali.

Ahmadinejad, AmirMahdi, Jonathan Kelner, Jack Murtagh, John Peebles, Aaron Sidford, and Salil Vadhan. “High-precision estimation of random walks in small space.” arXiv: 1912.04525 [cs.CC], 2019 (2019). ArXiv VersionAbstract
In this paper, we provide a deterministic $$\tilde{O}(\log N)$$-space algorithm for estimating the random walk probabilities on Eulerian directed graphs (and thus also undirected graphs) to within inverse polynomial additive error $$(ϵ = 1/\mathrm{poly}(N))$$ where $$N$$ is the length of the input. Previously, this problem was known to be solvable by a randomized algorithm using space $$O (\log N)$$ (Aleliunas et al., FOCS '79) and by a deterministic algorithm using space $$O (\log^{3/2} N)$$ (Saks and Zhou, FOCS '95 and JCSS '99), both of which held for arbitrary directed graphs but had not been improved even for undirected graphs. We also give improvements on the space complexity of both of these previous algorithms for non-Eulerian directed graphs when the error is negligible $$(ϵ=1/N^{ω(1)})$$, generalizing what Hoza and Zuckerman (FOCS '18) recently showed for the special case of distinguishing whether a random walk probability is 0 or greater than ϵ.
We achieve these results by giving new reductions between powering Eulerian random-walk matrices and inverting Eulerian Laplacian matrices, providing a new notion of spectral approximation for Eulerian graphs that is preserved under powering, and giving the first deterministic $$\tilde{O}(\log N)$$-space algorithm for inverting Eulerian Laplacian matrices. The latter algorithm builds on the work of Murtagh et al. (FOCS '17) that gave a deterministic $$\tilde{O}(\log N)$$-space algorithm for inverting undirected Laplacian matrices, and the work of Cohen et al. (FOCS '19) that gave a randomized $$\tilde{O} (N)$$-time algorithm for inverting Eulerian Laplacian matrices. A running theme throughout these contributions is an analysis of "cycle-lifted graphs," where we take a graph and "lift" it to a new graph whose adjacency matrix is the tensor product of the original adjacency matrix and a directed cycle (or variants of one).
Murtagh, Jack, Omer Reingold, Aaron Sidford, and Salil Vadhan. “Deterministic approximation of random walks in small space.” In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019), Dimitris Achlioptas and László A. Végh (Eds.). Vol. 145. Cambridge, Massachusetts (MIT) : Leibniz International Proceedings in Informatics (LIPIcs), 2019. Publisher's VersionAbstract
Version History: v1, 15 Mar. 2019: https://arxiv.org/abs/1903.06361v1
v2 in ArXiv, 25 Nov. 2019: https://arxiv.org/abs/1903.06361v2

Publisher's Version (APPROX-RANDOM 2019), 20 Sep 2019:

We give a deterministic, nearly logarithmic-space algorithm that given an undirected graph $$G$$, a positive integer $$r$$, and a set $$S$$ of vertices, approximates the conductance of $$S$$ in the $$r$$-step random walk on $$G$$ to within a factor of $$1+ϵ$$, where $$ϵ > 0$$ is an arbitrarily small constant. More generally, our algorithm computes an $$ϵ$$-spectral approximation to the normalized Laplacian of the $$r$$-step walk. Our algorithm combines the derandomized square graph operation (Rozenman and Vadhan, 2005), which we recently used for solving Laplacian systems in nearly logarithmic space (Murtagh, Reingold, Sidford, and Vadhan, 2017), with ideas from (Cheng, Cheng, Liu, Peng, and Teng, 2015), which gave an algorithm that is time-efficient (while ours is space-efficient) and randomized (while ours is deterministic) for the case of even $$r$$ (while ours works for all $$r$$). Along the way, we provide some new results that generalize technical machinery and yield improvements over previous work. First, we obtain a nearly linear-time randomized algorithm for computing a spectral approximation to the normalized Laplacian for odd $$r$$. Second, we define and analyze a generalization of the derandomized square for irregular graphs and for sparsifying the product of two distinct graphs. As part of this generalization, we also give a strongly explicit construction of expander graphs of every size.

Agrawal, Rohit, Yi-Hsiu Chen, Thibaut Horel, and Salil Vadhan. “Unifying computational entropies via Kullback-Leibler divergence.” In Advances in Cryptology: CRYPTO 2019, A. Boldyreva and D. Micciancio, (Eds), 11693:831-858. Springer Verlag, Lecture Notes in Computer Science, 2019. Publisher's VersionAbstract
Version History:
arXiv, first posted Feb 2019, most recently updated Aug 2019: https://arxiv.org/abs/1902.11202

We introduce hardness in relative entropy, a new notion of hardness for search problems which on the one hand is satisfied by all one-way functions and on the other hand implies both next-block pseudoentropy and inaccessible entropy, two forms of computational entropy used in recent constructions of pseudorandom generators and statistically hiding commitment schemes, respectively. Thus, hardness in relative entropy unifies the latter two notions of computational entropy and sheds light on the apparent “duality” between them. Additionally, it yields a more modular and illuminating proof that one-way functions imply next-block inaccessible entropy, similar in structure to the proof that one-way functions imply next-block pseudoentropy (Vadhan and Zheng, STOC ‘12).
Raghunathan, Ananth, Gil Segev, and Salil P. Vadhan. “Deterministic public-key encryption for adaptively-chosen plaintext distributions.” Journal of Cryptology 31, no. 4 (2018): 1012-1063. Publisher's VersionAbstract

Version History: Preliminary versions in EUROCRYPT ‘13 and Cryptology ePrint report 2013/125.

Bellare, Boldyreva, and O’Neill (CRYPTO '07) initiated the study of deterministic public-key encryption as an alternative in scenarios where randomized encryption has inherent drawbacks. The resulting line of research has so far guaranteed security only for adversarially-chosen plaintext distributions that are independent of the public key used by the scheme. In most scenarios, however, it is typically not realistic to assume that adversaries do not take the public key into account when attacking a scheme.

We show that it is possible to guarantee meaningful security even for plaintext distributions that depend on the public key. We extend the previously proposed notions of security, allowing adversaries to adaptively choose plaintext distributions after seeing the public key, in an interactive manner. The only restrictions we make are that: (1) plaintext distributions are unpredictable (as is essential in deterministic public-key encryption), and (2) the number of plaintext distributions from which each adversary is allowed to adaptively choose is upper bounded by $$2^p$$, where $$p$$ can be any predetermined polynomial in the security parameter. For example, with $$p=0$$ we capture plaintext distributions that are independent of the public key, and with $$p=0(s \log s)$$ we capture, in particular, all plaintext distributions that are samplable by circuits of size $$s$$.

Within our framework we present both constructions in the random-oracle model based on any public-key encryption scheme, and constructions in the standard model based on lossy trapdoor functions (thus, based on a variety of number-theoretic assumptions). Previously known constructions heavily relied on the independence between the plaintext distributions and the public key for the purposes of randomness extraction. In our setting, however, randomness extraction becomes significantly more challenging once the plaintext distributions and the public key are no longer independent. Our approach is inspired by research on randomness extraction from seed-dependent distributions. Underlying our approach is a new generalization of a method for such randomness extraction, originally introduced by Trevisan and Vadhan (FOCS '00) and Dodis (PhD Thesis, MIT, '00).

Chen, Yi-Hsiu, Mika Goos, Salil P. Vadhan, and Jiapeng Zhang. “A tight lower bound for entropy flattening.” In 33rd Computational Complexity Conference (CCC 2018), 102:23:21-23:28. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik: Leibniz International Proceedings in Informatics (LIPIcs), 2018. Publisher's VersionAbstract

Version History: Preliminary version posted as ECCC TR18-119.

We study entropy flattening: Given a circuit $$C_X$$ implicitly describing an n-bit source $$X$$ (namely, $$X$$ is the output of $$C_X$$  on a uniform random input), construct another circuit $$C_Y$$ describing a source $$Y$$ such that (1) source $$Y$$ is nearly flat (uniform on its support), and (2) the Shannon entropy of $$Y$$ is monotonically related to that of $$X$$. The standard solution is to have $$C_Y$$ evaluate $$C_X$$ altogether $$\Theta(n^2)$$ times on independent inputs and concatenate the results (correctness follows from the asymptotic equipartition property). In this paper, we show that this is optimal among black-box constructions: Any circuit $$C_Y$$ for entropy flattening that repeatedly queries $$C_X$$ as an oracle requires $$\Omega(n^2)$$queries.

Entropy flattening is a component used in the constructions of pseudorandom generators and other cryptographic primitives from one-way functions [12, 22, 13, 6, 11, 10, 7, 24]. It is also used in reductions between problems complete for statistical zero-knowledge [19, 23, 4, 25]. The $$\Theta(n^2)$$ query complexity is often the main efficiency bottleneck. Our lower bound can be viewed as a step towards proving that the current best construction of pseudorandom generator from arbitrary one-way functions by Vadhan and Zheng (STOC 2012) has optimal efficiency.

Haitner, Iftach, and Salil Vadhan. “The Many Entropies in One-way Functions.” In Tutorials on the Foundations of Cryptography, 159-217. Springer, Yehuda Lindell, ed. 2017. Publisher's VersionAbstract

Version History:

Earlier versions: May 2017: ECCC TR 17-084

Dec. 2017: ECCC TR 17-084 (revised)

Computational analogues of information-theoretic notions have given rise to some of the most interesting phenomena in the theory of computation. For example, computational indistinguishability, Goldwasser and Micali [9], which is the computational analogue of statistical distance, enabled the bypassing of Shannon’s impossibility results on perfectly secure encryption, and provided the basis for the computational theory of pseudorandomness. Pseudoentropy, Håstad, Impagliazzo, Levin, and Luby [17], a computational analogue of entropy, was the key to the fundamental result establishing the equivalence of pseudorandom generators and one-way functions, and has become a basic concept in complexity theory and cryptography.

This tutorial discusses two rather recent computational notions of entropy, both of which can be easily found in any one-way function, the most basic cryptographic primitive. The first notion is next-block pseudoentropy, Haitner, Reingold, and Vadhan [14], a refinement of pseudoentropy that enables simpler and more ecient construction of pseudorandom generators. The second is inaccessible entropy, Haitner, Reingold, Vadhan, andWee [11], which relates to unforgeability and is used to construct simpler and more efficient universal one-way hash functions and statistically hiding commitments.

Steinke, Thomas, Salil Vadhan, and Andrew Wan. “Pseudorandomness and Fourier growth bounds for width 3 branching programs.” Theory of Computing – Special Issue on APPROX-RANDOM 2014 13, no. 12 (2017): 1-50. Publisher's VersionAbstract

Version History: a conference version of this paper appeared in the Proceedings of the 18th International Workshop on Randomization and Computation (RANDOM'14). Full version posted as ECCC TR14-076 and arXiv:1405.7028 [cs.CC].

We present an explicit pseudorandom generator for oblivious, read-once, width-3 branching programs, which can read their input bits in any order. The generator has seed length $$Õ(\log^3 n)$$.The previously best known seed length for this model is $$n^{1/2+o(1)}$$ due to Impagliazzo, Meka, and Zuckerman (FOCS ’12). Our work generalizes a recent result of Reingold, Steinke, and Vadhan (RANDOM ’13) for permutation branching programs. The main technical novelty underlying our generator is a new bound on the Fourier growth of width-3, oblivious, read-once branching programs. Specifically, we show that for any $$f : \{0, 1\}^n → \{0, 1\}$$ computed by such a branching program, and $$k ∈ [n]$$,

$$\displaystyle\sum_{s⊆[n]:|s|=k} \big| \hat{f}[s] \big | ≤n^2 ·(O(\log n))^k$$,

where $$\hat{f}[s] = \mathbb{E}_U [f[U] \cdot (-1)^{s \cdot U}]$$ is the standard Fourier transform over $$\mathbb{Z}^n_2$$. The base $$O(\log n)$$ of the Fourier growth is tight up to a factor of $$\log \log n$$.

Vadhan., Salil P.On learning vs. refutation.” 30th Conference on Learning Theory (COLT 17), 2017, 65, 1835-1848. Publisher's VersionAbstract
Building on the work of Daniely et al. (STOC 2014, COLT 2016), we study the connection between computationally efficient PAC learning and refutation of constraint satisfaction problems. Specifically, we prove that for every concept class $$\mathcal{P }$$ , PAC-learning $$\mathcal{P}$$ is polynomially equivalent to “random-right-hand-side-refuting” (“RRHS-refuting”) a dual class $$\mathcal{P}^∗$$, where RRHS-refutation of a class $$Q$$ refers to refuting systems of equations where the constraints are (worst-case) functions from the class $$Q$$ but the right-hand-sides of the equations are uniform and independent random bits. The reduction from refutation to PAC learning can be viewed as an abstraction of (part of) the work of Daniely, Linial, and Shalev-Schwartz (STOC 2014). The converse, however, is new, and is based on a combination of techniques from pseudorandomness (Yao ‘82) with boosting (Schapire ‘90). In addition, we show that PAC-learning the class of $$DNF$$ formulas is polynomially equivalent to PAC-learning its dual class $$DNF ^∗$$ , and thus PAC-learning $$DNF$$ is equivalent to RRHS-refutation of $$DNF$$ , suggesting an avenue to obtain stronger lower bounds for PAC-learning $$DNF$$ than the quasipolynomial lower bound that was obtained by Daniely and Shalev-Schwartz (COLT 2016) assuming the hardness of refuting $$k$$-SAT.
Murtagh, Jack, Omer Reingold, Aaron Sidford, and Salil Vadhan. “Derandomization beyond connectivity: Undirected Laplacian systems in nearly logarithmic space.58th Annual IEEE Symposium on Foundations of Computer Science (FOCS 17), 2017. Publisher's VersionAbstract
Version History
ArXiv, 15 August 2017 https://arxiv.org/abs/1708.04634

We give a deterministic $$\overline{O} (\log n)$$-space algorithm for approximately solving linear systems given by Laplacians of undirected graphs, and consequently also approximating hitting times, commute times, and escape probabilities for undirected graphs. Previously, such systems were known to be solvable by randomized algorithms using $$O(\log n)$$ space (Doron, Le Gall, and Ta-Shma, 2017) and hence by deterministic algorithms using $$O(\log^{3/2} n)$$  space (Saks and Zhou, FOCS 1995 and JCSS 1999).

Our algorithm combines ideas from time-efficient Laplacian solvers (Spielman and Teng, STOC '04; Peng and Spielman, STOC '14) with ideas used to show that Undirected S-T Connectivity is in deterministic logspace (Reingold, STOC '05 and JACM '08; Rozenman and Vadhan, RANDOM '05).
Chen, Yi-Hsiu, Kai-Min Chung, Ching-Yi Lai, Salil P. Vadhan, and Xiaodi Wu.Computational notions of quantum min-entropy.” In Poster presention at QIP 2017 and oral presentation at QCrypt 2017, 2017. Publisher's VersionAbstract

Version History

ArXiv v1, 24 April 2017 https://arxiv.org/abs/1704.07309v1
ArXiv v2, 25 April 2017 https://arxiv.org/abs/1704.07309v2
ArXiv v3, 9 September 2017 https://arxiv.org/abs/1704.07309v3
ArXiv v4, 5 October 2017 https://arxiv.org/abs/1704.07309v4

We initiate the study of computational entropy in the quantum setting. We investigate to what extent the classical notions of computational entropy generalize to the quantum setting, and whether quantum analogues of classical theorems hold. Our main results are as follows. (1) The classical Leakage Chain Rule for pseudoentropy can be extended to the case that the leakage information is quantum (while the source remains classical). Specifically, if the source has pseudoentropy at least $$k$$, then it has pseudoentropy at least $$k−ℓ$$ conditioned on an $$ℓ$$-qubit leakage. (2) As an application of the Leakage Chain Rule, we construct the first quantum leakage-resilient stream-cipher in the bounded-quantum-storage model, assuming the existence of a quantum-secure pseudorandom generator. (3) We show that the general form of the classical Dense Model Theorem (interpreted as the equivalence between two definitions of pseudo-relative-min-entropy) does not extend to quantum states. Along the way, we develop quantum analogues of some classical techniques (e.g. the Leakage Simulation Lemma, which is proven by a Non-uniform Min-Max Theorem or Boosting). On the other hand, we also identify some classical techniques (e.g. Gap Amplification) that do not work in the quantum setting. Moreover, we introduce a variety of notions that combine quantum information and quantum complexity, and this raises several directions for future work.

Chen, Sitan, Thomas Steinke, and Salil P. Vadhan. “Pseudorandomness for read-once, constant-depth circuits.” CoRR, 2015, 1504.04675. Publisher's VersionAbstract

For Boolean functions computed by read-once, depth-D circuits with unbounded fan-in over the de Morgan basis, we present an explicit pseudorandom generator with seed length $$\tilde{O}(\log^{D+1} n)$$. The previous best seed length known for this model was $$\tilde{O}(\log^{D+4} n)$$, obtained by Trevisan and Xue (CCC ‘13) for all of AC0 (not just read-once). Our work makes use of Fourier analytic techniques for pseudorandomness introduced by Reingold, Steinke, and Vadhan (RANDOM ‘13) to show that the generator of Gopalan et al. (FOCS ‘12) fools read-once AC0. To this end, we prove a new Fourier growth bound for read-once circuits, namely that for every $$F : \{0,1\}^n\rightarrow \{0,1\}$$ computed by a read-once, depth-$$D$$ circuit,

$$\left|\hat{F}[s]\right| \leq O\left(\log^{D-1} n\right)^k,$$

where $$\hat{F}$$ denotes the Fourier transform of $$F$$ over $$\mathbb{Z}_2^n$$.

Gopalan, Parikshit, Salil Vadhan, and Yuan Zhou. “Locally testable codes and Cayley graphs.” In In Moni Naor, editor, Innovations in Theoretical Computer Science (ITCS ‘14), 81-92. ACM, 2014. Publisher's VersionAbstract

Version History: Full version posted as https://arxiv.org/pdf/1308.5158.pdf.

We give two new characterizations of ($$\mathbb{F}_2$$-linear) locally testable error-correcting codes in terms of Cayley graphs over $$\mathbb{F}^h_2$$ :

1. A locally testable code is equivalent to a Cayley graph over $$\mathbb{F}^h_2$$ whose set of generators is significantly larger than $$h$$ and has no short linear dependencies, but yields a shortest-path metric that embeds into $$\ell_1$$ with constant distortion. This extends and gives a converse to a result of Khot and Naor (2006), which showed that codes with large dual distance imply Cayley graphs that have no low-distortion embeddings into $$\ell_1$$.

2. A locally testable code is equivalent to a Cayley graph over $$\mathbb{F}^h_2$$ that has significantly more than $$h$$ eigenvalues near 1, which have no short linear dependencies among them and which “explain” all of the large eigenvalues. This extends and gives a converse to a recent construction of Barak et al. (2012), which showed that locally testable codes imply Cayley graphs that are small-set expanders but have many large eigenvalues.

Chung, Kai-Min, Michael Mitzenmacher, and Salil P. Vadhan. “Why simple hash functions work: Exploiting the entropy in a data stream.” Theory of Computing 9 (2013): 897-945. Publisher's VersionAbstract

Version History: Merge of conference papers from SODA ‘08 (with the same title) and RANDOM ‘08 (entitled “Tight Bounds for Hashing Block Sources”).

Hashing is fundamental to many algorithms and data structures widely used in practice. For the theoretical analysis of hashing, there have been two main approaches. First, one can assume that the hash function is truly random, mapping each data item independently and uniformly to the range. This idealized model is unrealistic because a truly random hash function requires an exponential number of bits (in the length of a data item) to describe. Alternatively, one can provide rigorous bounds on performance when explicit families of hash functions are used, such as 2-universal or $$O$$(1)-wise independent families. For such families, performance guarantees are often noticeably weaker than for ideal hashing.

In practice, however, it is commonly observed that simple hash functions, including 2-universal hash functions, perform as predicted by the idealized analysis for truly random hash functions. In this paper, we try to explain this phenomenon. We demonstrate that the strong performance of universal hash functions in practice can arise naturally from a combination of the randomness of the hash function and the data. Specifically, following the large body of literature on random sources and randomness extraction, we model the data as coming from a “block source,” whereby each new data item has some “entropy” given the previous ones. As long as the Rényi entropy per data item is sufficiently large, it turns out that the performance when choosing a hash function from a 2-universal family is essentially the same as for a truly random hash function. We describe results for several sample applications, including linear probing, chained hashing, balanced allocations, and Bloom filters.

Towards developing our results, we prove tight bounds for hashing block sources, determining the entropy required per block for the distribution of hashed values to be close to uniformly distributed.