# Publications

2019
Vadhan, Salil. “Computational entropy.” In Providing Sound Foundations for Cryptography: On the Work of Shafi Goldwasser and Silvio Micali (Oded Goldreich, Ed.), 693-726. ACM, 2019.
Ahmadinejad, AmirMahdi, Jonathan Kelner, Jack Murtagh, John Peebles, Aaron Sidford, and Salil Vadhan. “High-precision estimation of random walks in small space.” arXiv: 1912.04525 [cs.CC], 2019.
In this paper, we provide a deterministic $$\tilde{O}(\log N)$$-space algorithm for estimating the random walk probabilities on Eulerian directed graphs (and thus also undirected graphs) to within inverse polynomial additive error $$(ϵ = 1/\mathrm{poly}(N))$$, where $$N$$ is the length of the input. Previously, this problem was known to be solvable by a randomized algorithm using space $$O(\log N)$$ (Aleliunas et al., FOCS 79) and by a deterministic algorithm using space $$O(\log^{3/2} N)$$ (Saks and Zhou, FOCS 95 and JCSS 99), both of which held for arbitrary directed graphs but had not been improved even for undirected graphs. We also give improvements on the space complexity of both of these previous algorithms for non-Eulerian directed graphs when the error is negligible $$(ϵ=1/N^{ω(1)})$$, generalizing what Hoza and Zuckerman (FOCS 18) recently showed for the special case of distinguishing whether a random walk probability is 0 or greater than $$ϵ$$.
We achieve these results by giving new reductions between powering Eulerian random-walk matrices and inverting Eulerian Laplacian matrices, providing a new notion of spectral approximation for Eulerian graphs that is preserved under powering, and giving the first deterministic $$\tilde{O}(\log N)$$-space algorithm for inverting Eulerian Laplacian matrices. The latter algorithm builds on the work of Murtagh et al. (FOCS 17) that gave a deterministic $$\tilde{O}(\log N)$$-space algorithm for inverting undirected Laplacian matrices, and the work of Cohen et al. (FOCS 19) that gave a randomized $$\tilde{O} (N)$$-time algorithm for inverting Eulerian Laplacian matrices. A running theme throughout these contributions is an analysis of "cycle-lifted graphs," where we take a graph and "lift" it to a new graph whose adjacency matrix is the tensor product of the original adjacency matrix and a directed cycle (or variants of one).
Murtagh, Jack, Omer Reingold, Aaron Sidford, and Salil Vadhan. “Deterministic Approximation of Random Walks in Small Space.” In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019), Dimitris Achlioptas and László A. Végh (Eds.). Vol. 145. Cambridge, Massachusetts (MIT): Leibniz International Proceedings in Informatics (LIPIcs), 2019.
Version History: v1, 15 Mar. 2019: https://arxiv.org/abs/1903.06361v1
v2 in ArXiv, 25 Nov. 2019: https://arxiv.org/abs/1903.06361v2

Publisher's Version (APPROX-RANDOM 2019), 20 Sep 2019.

We give a deterministic, nearly logarithmic-space algorithm that, given an undirected graph $$G$$, a positive integer $$r$$, and a set $$S$$ of vertices, approximates the conductance of $$S$$ in the $$r$$-step random walk on $$G$$ to within a factor of $$1+ϵ$$, where $$ϵ > 0$$ is an arbitrarily small constant. More generally, our algorithm computes an $$ϵ$$-spectral approximation to the normalized Laplacian of the $$r$$-step walk. Our algorithm combines the derandomized square graph operation (Rozenman and Vadhan, 2005), which we recently used for solving Laplacian systems in nearly logarithmic space (Murtagh, Reingold, Sidford, and Vadhan, 2017), with ideas from Cheng, Cheng, Liu, Peng, and Teng (2015), who gave an algorithm that is time-efficient (while ours is space-efficient) and randomized (while ours is deterministic) for the case of even $$r$$ (while ours works for all $$r$$). Along the way, we provide some new results that generalize technical machinery and yield improvements over previous work. First, we obtain a nearly linear-time randomized algorithm for computing a spectral approximation to the normalized Laplacian for odd $$r$$. Second, we define and analyze a generalization of the derandomized square for irregular graphs and for sparsifying the product of two distinct graphs. As part of this generalization, we also give a strongly explicit construction of expander graphs of every size.
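
To make the quantity in the abstract above concrete, here is a minimal brute-force sketch in Python that computes the conductance of a set $$S$$ in the $$r$$-step walk exactly. It assumes the usual definition (the probability that a walk started from the stationary distribution restricted to $$S$$ lies outside $$S$$ after $$r$$ steps). The helper names and the example path graph are hypothetical, and the code makes no attempt to be space-efficient, so it is not the paper's algorithm.

```python
from fractions import Fraction
from typing import Dict, List, Set

Graph = Dict[int, List[int]]  # adjacency lists of an undirected graph


def r_step_distribution(graph: Graph, start: int, r: int) -> Dict[int, Fraction]:
    """Exact distribution of a simple random walk after r steps from `start`."""
    dist = {start: Fraction(1)}
    for _ in range(r):
        nxt: Dict[int, Fraction] = {}
        for u, p in dist.items():
            share = p / len(graph[u])  # move to each neighbor with equal probability
            for v in graph[u]:
                nxt[v] = nxt.get(v, Fraction(0)) + share
        dist = nxt
    return dist


def conductance(graph: Graph, S: Set[int], r: int) -> Fraction:
    """Pr[walk is outside S after r steps | start ~ stationary restricted to S].

    For an undirected graph the stationary probability of u is proportional
    to deg(u), so each start vertex is weighted by its degree.
    """
    vol_S = sum(len(graph[u]) for u in S)
    escaping = Fraction(0)
    for u in S:
        dist = r_step_distribution(graph, u, r)
        leave = sum((p for v, p in dist.items() if v not in S), Fraction(0))
        escaping += len(graph[u]) * leave
    return escaping / vol_S


# Hypothetical example: the path 0 - 1 - 2 - 3 with S = {0, 1}.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
phi_1 = conductance(path, {0, 1}, r=1)
phi_2 = conductance(path, {0, 1}, r=2)
```

Exact rational arithmetic is used here only so that small examples can be checked by hand.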

Agrawal, Rohit, Yi-Hsiu Chen, Thibaut Horel, and Salil Vadhan. “Unifying computational entropies via Kullback-Leibler divergence.” In Advances in Cryptology: CRYPTO 2019, A. Boldyreva and D. Micciancio (Eds.), 11693:831-858. Springer Verlag, Lecture Notes in Computer Science, 2019.
Version History:
arXiv, first posted Feb 2019, most recently updated Aug 2019: https://arxiv.org/abs/1902.11202

We introduce KL-hardness, a new notion of hardness for search problems which on the one hand is satisfied by all one-way functions and on the other hand implies both next-block pseudoentropy and inaccessible entropy, two forms of computational entropy used in recent constructions of pseudorandom generators and statistically hiding commitment schemes, respectively. Thus, KL-hardness unifies the latter two notions of computational entropy and sheds light on the apparent "duality" between them. Additionally, it yields a more modular and illuminating proof that one-way functions imply next-block inaccessible entropy, similar in structure to the proof that one-way functions imply next-block pseudoentropy (Vadhan and Zheng, STOC '12).

2018
Raghunathan, Ananth, Gil Segev, and Salil P. Vadhan. “Deterministic public-key encryption for adaptively-chosen plaintext distributions.” Journal of Cryptology 31, no. 4 (2018): 1012-1063.

Version History: Preliminary versions in EUROCRYPT ‘13 and Cryptology ePrint report 2013/125.

Bellare, Boldyreva, and O’Neill (CRYPTO ’07) initiated the study of deterministic public-key encryption as an alternative in scenarios where randomized encryption has inherent drawbacks. The resulting line of research has so far guaranteed security only for adversarially-chosen plaintext distributions that are independent of the public key used by the scheme. In most scenarios, however, it is typically not realistic to assume that adversaries do not take the public key into account when attacking a scheme.

We show that it is possible to guarantee meaningful security even for plaintext distributions that depend on the public key. We extend the previously proposed notions of security, allowing adversaries to adaptively choose plaintext distributions after seeing the public key, in an interactive manner. The only restrictions we make are that: (1) plaintext distributions are unpredictable (as is essential in deterministic public-key encryption), and (2) the number of plaintext distributions from which each adversary is allowed to adaptively choose is upper bounded by $$2^p$$, where $$p$$ can be any predetermined polynomial in the security parameter. For example, with $$p=0$$ we capture plaintext distributions that are independent of the public key, and with $$p=O(s \log s)$$ we capture, in particular, all plaintext distributions that are samplable by circuits of size $$s$$.

Within our framework we present both constructions in the random-oracle model based on any public-key encryption scheme, and constructions in the standard model based on lossy trapdoor functions (thus, based on a variety of number-theoretic assumptions). Previously known constructions heavily relied on the independence between the plaintext distributions and the public key for the purposes of randomness extraction. In our setting, however, randomness extraction becomes significantly more challenging once the plaintext distributions and the public key are no longer independent. Our approach is inspired by research on randomness extraction from seed-dependent distributions. Underlying our approach is a new generalization of a method for such randomness extraction, originally introduced by Trevisan and Vadhan (FOCS ’00) and Dodis (PhD Thesis, MIT, ’00).

Bun, Mark, Jonathan Ullman, and Salil Vadhan. “Fingerprinting codes and the price of approximate differential privacy.” SIAM Journal on Computing, Special Issue on STOC '14 47, no. 5 (2018): 1888-1938.

Version History: Special Issue on STOC ‘14. Preliminary versions in STOC ‘14 and arXiv:1311.3158 [cs.CR].

We show new information-theoretic lower bounds on the sample complexity of (ε, δ)-differentially private algorithms that accurately answer large sets of counting queries. A counting query on a database $$D ∈ (\{0, 1\}^d)^n$$ has the form “What fraction of the individual records in the database satisfy the property $$q$$?” We show that in order to answer an arbitrary set $$Q$$ of $$\gg d/ \alpha^2$$ counting queries on $$D$$ to within error $$±α$$ it is necessary that $$n ≥ \tilde{Ω}(\sqrt{d} \log |Q|/(α^2 ε))$$. This bound is optimal up to polylogarithmic factors, as demonstrated by the private multiplicative weights algorithm (Hardt and Rothblum, FOCS’10). In particular, our lower bound is the first to show that the sample complexity required for accuracy and (ε, δ)-differential privacy is asymptotically larger than what is required merely for accuracy, which is $$O(\log |Q|/α^2)$$. In addition, we show that our lower bound holds for the specific case of $$k$$-way marginal queries (where $$|Q| = 2^k \binom{d}{k}$$) when $$\alpha$$ is not too small compared to $$d$$ (e.g., when $$\alpha$$ is any fixed constant). Our results rely on the existence of short fingerprinting codes (Boneh and Shaw, CRYPTO’95; Tardos, STOC’03), which we show are closely connected to the sample complexity of differentially private data release. We also give a new method for combining certain types of sample-complexity lower bounds into stronger lower bounds.
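
As a small illustration of the objects defined in the abstract above, the following Python sketch evaluates a (non-private) counting query on a toy database and counts the $$k$$-way marginal queries using the formula $$|Q| = 2^k \binom{d}{k}$$. The function names and the example database are hypothetical, and nothing here adds noise or provides differential privacy.

```python
from math import comb
from typing import Callable, Sequence, Tuple

Record = Tuple[int, ...]  # one row of the database, an element of {0,1}^d


def counting_query(db: Sequence[Record], q: Callable[[Record], bool]) -> float:
    """Return the fraction of records in db that satisfy the predicate q."""
    if not db:
        raise ValueError("database must be non-empty")
    return sum(1 for row in db if q(row)) / len(db)


def num_k_way_marginals(d: int, k: int) -> int:
    """Number of k-way marginal queries: choose k attributes and a value for each."""
    return 2 ** k * comb(d, k)


# Hypothetical database with n = 4 records and d = 3 binary attributes.
db = [(1, 0, 1), (1, 1, 0), (0, 0, 1), (1, 1, 1)]


def both_set(row: Record) -> bool:
    """A 2-way marginal query: attributes 0 and 1 are both equal to 1."""
    return row[0] == 1 and row[1] == 1


answer = counting_query(db, both_set)   # exact, non-private answer
query_count = num_k_way_marginals(3, 2)
```

The lower bound in the abstract concerns how large $$n$$ must be before answers like `answer` can be released for every query in $$Q$$ with error $$±α$$ under (ε, δ)-differential privacy.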

Murtagh, Jack, and Salil Vadhan. “The complexity of computing the optimal composition of differential privacy.” Theory of Computing 14 (2018): 1-35.

Version History: Full version posted on CoRR, abs/1507.03113, July 2015. Additional version published in Proceedings of the 13th IACR Theory of Cryptography Conference (TCC ‘16-A).

In the study of differential privacy, composition theorems (starting with the original paper of Dwork, McSherry, Nissim, and Smith (TCC’06)) bound the degradation of privacy when composing several differentially private algorithms. Kairouz, Oh, and Viswanath (ICML’15) showed how to compute the optimal bound for composing $$k$$ arbitrary ($$\epsilon$$, $$\delta$$)-differentially private algorithms. We characterize the optimal composition for the more general case of $$k$$ arbitrary ($$\epsilon_1$$, $$\delta_1$$), . . . , ($$\epsilon_k$$, $$\delta_k$$)-differentially private algorithms where the privacy parameters may differ for each algorithm in the composition. We show that computing the optimal composition in general is $$\#\mathrm{P}$$-complete. Since computing optimal composition exactly is infeasible (unless $$\mathrm{FP} = \#\mathrm{P}$$), we give an approximation algorithm that computes the composition to arbitrary accuracy in polynomial time. The algorithm is a modification of Dyer’s dynamic programming approach to approximately counting solutions to knapsack problems (STOC’03).
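
For orientation, the simplest composition bound that the optimal composition improves on is basic composition, under which the $$\epsilon_i$$ and $$\delta_i$$ simply add up. The Python sketch below computes that baseline only; it is not the optimal composition characterized in the paper, nor the paper's approximation algorithm, and the function name and example parameters are hypothetical.

```python
from typing import Sequence, Tuple

PrivacyParams = Tuple[float, float]  # (epsilon_i, delta_i) for one algorithm


def basic_composition(params: Sequence[PrivacyParams]) -> PrivacyParams:
    """Basic composition: the composed algorithm is (sum eps_i, sum delta_i)-DP.

    This is a valid but generally loose upper bound on the privacy loss.
    """
    for eps, delta in params:
        if eps < 0 or not 0 <= delta <= 1:
            raise ValueError("need eps >= 0 and 0 <= delta <= 1")
    eps_total = sum(eps for eps, _ in params)
    delta_total = min(1.0, sum(delta for _, delta in params))  # delta never exceeds 1
    return eps_total, delta_total


# Hypothetical example: three algorithms with different privacy parameters.
composed = basic_composition([(0.5, 0.0), (0.25, 0.125), (0.25, 0.125)])
```

The optimal bound for the same inputs is never worse than this baseline, which is why computing it (or approximating it efficiently) is of interest.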

Karwa, Vishesh, and Salil Vadhan. “Finite sample differentially private confidence intervals.” In Anna R. Karlin, editor, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), volume 94 of Leibniz International Proceedings in Informatics (LIPIcs), 44:1-44:9. Dagstuhl, Germany, 2018. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. ITCS, 2018.

Version History: Also presented at TPDP 2017. Preliminary version posted as arXiv:1711.03908 [cs.CR].

We study the problem of estimating finite sample confidence intervals of the mean of a normal population under the constraint of differential privacy. We consider both the known and unknown variance cases and construct differentially private algorithms to estimate confidence intervals. Crucially, our algorithms guarantee a finite sample coverage, as opposed to an asymptotic coverage. Unlike most previous differentially private algorithms, we do not require the domain of the samples to be bounded. We also prove lower bounds on the expected size of any differentially private confidence set, showing that our parameters are optimal up to polylogarithmic factors.

Balcer, Victor, and Salil Vadhan. “Differential privacy on finite computers.” In Anna R. Karlin, editor, 9th Innovations in Theoretical Computer Science Conference (ITCS 2018), volume 94 of Leibniz International Proceedings in Informatics (LIPIcs), 43:1-43:21. Dagstuhl, Germany, 2018. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. ITCS, 2018.

Version History: Also presented at TPDP 2017. Invited to J. Privacy & Confidentiality Special Issue on TPDP 2017. Preliminary version posted as arXiv:1709.05396 [cs.DS].

We consider the problem of designing and analyzing differentially private algorithms that can be implemented on discrete models of computation in strict polynomial time, motivated by known attacks on floating point implementations of real-arithmetic differentially private algorithms (Mironov, CCS 2012) and the potential for timing attacks on expected polynomial-time algorithms. As a case study, we examine the basic problem of approximating the histogram of a categorical dataset over a possibly large data universe $$\chi$$. The classic Laplace Mechanism (Dwork, McSherry, Nissim, Smith, TCC 2006 and J. Privacy & Confidentiality 2017) does not satisfy our requirements, as it is based on real arithmetic, and natural discrete analogues, such as the Geometric Mechanism (Ghosh, Roughgarden, Sundararajan, STOC 2009 and SICOMP 2012), take time at least linear in $$|\chi|$$, which can be exponential in the bit length of the input.

In this paper, we provide strict polynomial-time discrete algorithms for approximate histograms whose simultaneous accuracy (the maximum error over all bins) matches that of the Laplace Mechanism up to constant factors, while retaining the same (pure) differential privacy guarantee. One of our algorithms produces a sparse histogram as output. Its “per-bin accuracy” (the error on individual bins) is worse than that of the Laplace Mechanism by a factor of $$\log |\chi|$$, but we prove a lower bound showing that this is necessary for any algorithm that produces a sparse histogram. A second algorithm avoids this lower bound, and matches the per-bin accuracy of the Laplace Mechanism, by producing a compact and efficiently computable representation of a dense histogram; it is based on an $$(n + 1)$$-wise independent implementation of an appropriately clamped version of the Discrete Geometric Mechanism.

Murtagh, Jack, Kathryn Taylor, George Kellaris, and Salil P. Vadhan. “Usable differential privacy: A case study with PSI.” arXiv, 2018, 1809.04103 [cs.CR].

Version History: v1, 11 September 2018 https://arxiv.org/abs/1809.04103

Differential privacy is a promising framework for addressing the privacy concerns in sharing sensitive datasets for others to analyze. However, differential privacy is a highly technical area and current deployments often require experts to write code, tune parameters, and optimize the trade-off between the privacy and accuracy of statistical releases. For differential privacy to achieve its potential for wide impact, it is important to design usable systems that enable differential privacy to be used by ordinary data owners and analysts. PSI is a tool that was designed for this purpose, allowing researchers to release useful differentially private statistical information about their datasets without being experts in computer science, statistics, or privacy. We conducted a thorough usability study of PSI to test whether it accomplishes its goal of usability by non-experts. The usability test illuminated which features of PSI are most user-friendly and prompted us to improve aspects of the tool that caused confusion. The test also highlighted some general principles and lessons for designing usable systems for differential privacy, which we discuss in depth.

Chen, Yi-Hsiu, Mika Goos, Salil P. Vadhan, and Jiapeng Zhang. “A tight lower bound for entropy flattening.” In 33rd Computational Complexity Conference (CCC 2018), 102:23:21-23:28. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik: Leibniz International Proceedings in Informatics (LIPIcs), 2018.

Version History: Preliminary version posted as ECCC TR18-119.

We study entropy flattening: Given a circuit $$C_X$$ implicitly describing an $$n$$-bit source $$X$$ (namely, $$X$$ is the output of $$C_X$$ on a uniform random input), construct another circuit $$C_Y$$ describing a source $$Y$$ such that (1) source $$Y$$ is nearly flat (uniform on its support), and (2) the Shannon entropy of $$Y$$ is monotonically related to that of $$X$$. The standard solution is to have $$C_Y$$ evaluate $$C_X$$ altogether $$\Theta(n^2)$$ times on independent inputs and concatenate the results (correctness follows from the asymptotic equipartition property). In this paper, we show that this is optimal among black-box constructions: Any circuit $$C_Y$$ for entropy flattening that repeatedly queries $$C_X$$ as an oracle requires $$\Omega(n^2)$$ queries.

Entropy flattening is a component used in the constructions of pseudorandom generators and other cryptographic primitives from one-way functions [12, 22, 13, 6, 11, 10, 7, 24]. It is also used in reductions between problems complete for statistical zero-knowledge [19, 23, 4, 25]. The $$\Theta(n^2)$$ query complexity is often the main efficiency bottleneck. Our lower bound can be viewed as a step towards proving that the current best construction of pseudorandom generator from arbitrary one-way functions by Vadhan and Zheng (STOC 2012) has optimal efficiency.

Wood, Alexandra, Micah Altman, Aaron Bembenek, Mark Bun, Marco Gaboardi, James Honaker, Kobbi Nissim, David R. O'Brien, Thomas Steinke, and Salil Vadhan. “Differential privacy: A primer for a non-technical audience.” Vanderbilt Journal of Entertainment & Technology Law 21, no. 1 (2018): 209-275.

Version History: Preliminary version workshopped at PLSC 2017.

Differential privacy is a formal mathematical framework for quantifying and managing privacy risks. It provides provable privacy protection against a wide range of potential attacks, including those currently unforeseen. Differential privacy is primarily studied in the context of the collection, analysis, and release of aggregate statistics.

These range from simple statistical estimations, such as averages, to machine learning. Tools for differentially private analysis are now in early stages of implementation and use across a variety of academic, industry, and government settings. Interest in the concept is growing among potential users of the tools, as well as within legal and policy communities, as it holds promise as a potential approach to satisfying legal requirements for privacy protection when handling personal information. In particular, differential privacy may be seen as a technical solution for analyzing and sharing data while protecting the privacy of individuals in accordance with existing legal or policy requirements for de-identification or disclosure limitation.

This primer seeks to introduce the concept of differential privacy and its privacy implications to non-technical audiences. It provides a simplified and informal, but mathematically accurate, description of differential privacy. Using intuitive illustrations and limited mathematical formalism, it discusses the definition of differential privacy, how differential privacy addresses privacy risks, how differentially private analyses are constructed, and how such analyses can be used in practice. A series of illustrations is used to show how practitioners and policymakers can conceptualize the guarantees provided by differential privacy. These illustrations are also used to explain related concepts, such as composition (the accumulation of risk across multiple analyses), privacy loss parameters, and privacy budgets.

This primer aims to provide a foundation that can guide future decisions when analyzing and sharing statistical data about individuals, informing individuals about the privacy protection they will be afforded, and designing policies and regulations for robust privacy protection.

2017
Steinke, Thomas, Salil Vadhan, and Andrew Wan. “Pseudorandomness and Fourier growth bounds for width 3 branching programs.” Theory of Computing – Special Issue on APPROX-RANDOM 2014 13, no. 12 (2017): 1-50.

Version History: A conference version of this paper appeared in the Proceedings of the 18th International Workshop on Randomization and Computation (RANDOM'14). Full version posted as ECCC TR14-076 and arXiv:1405.7028 [cs.CC].

We present an explicit pseudorandom generator for oblivious, read-once, width-3 branching programs, which can read their input bits in any order. The generator has seed length $$\tilde{O}(\log^3 n)$$. The previously best known seed length for this model is $$n^{1/2+o(1)}$$ due to Impagliazzo, Meka, and Zuckerman (FOCS ’12). Our work generalizes a recent result of Reingold, Steinke, and Vadhan (RANDOM ’13) for permutation branching programs. The main technical novelty underlying our generator is a new bound on the Fourier growth of width-3, oblivious, read-once branching programs. Specifically, we show that for any $$f : \{0, 1\}^n → \{0, 1\}$$ computed by such a branching program, and $$k ∈ [n]$$,

$$\displaystyle\sum_{s⊆[n]:|s|=k} \big| \hat{f}[s] \big | ≤n^2 ·(O(\log n))^k$$,

where $$\hat{f}[s] = \mathbb{E}_U [f[U] \cdot (-1)^{s \cdot U}]$$ is the standard Fourier transform over $$\mathbb{Z}^n_2$$. The base $$O(\log n)$$ of the Fourier growth is tight up to a factor of $$\log \log n$$.
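
The left-hand side of the bound above can be evaluated by brute force for tiny $$n$$. The Python sketch below does so directly from the definition of $$\hat{f}[s]$$; it takes time exponential in $$n$$ and only illustrates the quantity being bounded, not the paper's proof or generator. The function names and the toy function `xor2` are hypothetical.

```python
from itertools import combinations, product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def fourier_coefficient(f: Callable[[Bits], int], n: int, s: Tuple[int, ...]) -> float:
    """f_hat[s] = E_U[ f(U) * (-1)^(s . U) ] for U uniform on {0,1}^n.

    The set s is given as a tuple of coordinate indices.
    """
    total = 0
    for u in product((0, 1), repeat=n):
        sign = -1 if sum(u[i] for i in s) % 2 else 1
        total += f(u) * sign
    return total / 2 ** n


def level_k_mass(f: Callable[[Bits], int], n: int, k: int) -> float:
    """Sum of |f_hat[s]| over all subsets s of [n] with |s| = k."""
    return sum(abs(fourier_coefficient(f, n, s)) for s in combinations(range(n), k))


def xor2(u: Bits) -> int:
    """Toy function on n = 3 bits: XOR of the first two input bits."""
    return u[0] ^ u[1]


masses = [level_k_mass(xor2, 3, k) for k in range(4)]
```

For this toy function all of the Fourier mass sits on the empty set and on the pair of coordinates it reads, so the level-1 and level-3 sums vanish.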

Nissim, Kobbi, Aaron Bembenek, Alexandra Wood, Mark Bun, Marco Gaboardi, Urs Gasser, David O'Brien, Thomas Steinke, and Salil Vadhan. “Bridging the gap between computer science and legal approaches to privacy.” Harvard Journal of Law & Technology 31, no. 2 (2017).

Version History: Workshopped at PLSC (Privacy Law Scholars Conference) ‘16.

The analysis and release of statistical data about individuals and groups of individuals carries inherent privacy risks, and these risks have been conceptualized in different ways within the fields of law and computer science. For instance, many information privacy laws adopt notions of privacy risk that are sector- or context-specific, such as in the case of laws that protect from disclosure certain types of information contained within health, educational, or financial records. In addition, many privacy laws refer to specific techniques, such as deidentification, that are designed to address a subset of possible attacks on privacy. In doing so, many legal standards for privacy protection rely on individual organizations to make case-by-case determinations regarding concepts such as the identifiability of the types of information they hold. These regulatory approaches are intended to be flexible, allowing organizations to (1) implement a variety of specific privacy measures that are appropriate given their varying institutional policies and needs, (2) adapt to evolving best practices, and (3) address a range of privacy-related harms. However, in the absence of clear thresholds and detailed guidance on making case-specific determinations, flexibility in the interpretation and application of such standards also creates uncertainty for practitioners and often results in ad hoc, heuristic processes. This uncertainty may pose a barrier to the adoption of new technologies that depend on unambiguous privacy requirements. It can also lead organizations to implement measures that fall short of protecting against the full range of data privacy risks.

Vadhan, Salil P. “On learning vs. refutation.” 30th Conference on Learning Theory (COLT 17), 2017, 65, 1835-1848.
Building on the work of Daniely et al. (STOC 2014, COLT 2016), we study the connection between computationally efficient PAC learning and refutation of constraint satisfaction problems. Specifically, we prove that for every concept class $$P$$, PAC-learning $$P$$ is polynomially equivalent to “random-right-hand-side-refuting” (“RRHS-refuting”) a dual class $$P^∗$$, where RRHS-refutation of a class $$Q$$ refers to refuting systems of equations where the constraints are (worst-case) functions from the class $$Q$$ but the right-hand-sides of the equations are uniform and independent random bits. The reduction from refutation to PAC learning can be viewed as an abstraction of (part of) the work of Daniely, Linial, and Shalev-Shwartz (STOC 2014). The converse, however, is new, and is based on a combination of techniques from pseudorandomness (Yao ‘82) with boosting (Schapire ‘90). In addition, we show that PAC-learning the class of $$DNF$$ formulas is polynomially equivalent to PAC-learning its dual class $$DNF^∗$$, and thus PAC-learning $$DNF$$ is equivalent to RRHS-refutation of $$DNF$$, suggesting an avenue to obtain stronger lower bounds for PAC-learning $$DNF$$ than the quasipolynomial lower bound that was obtained by Daniely and Shalev-Shwartz (COLT 2016) assuming the hardness of refuting $$k$$-SAT.
Murtagh, Jack, Omer Reingold, Aaron Sidford, and Salil Vadhan. “Derandomization beyond connectivity: Undirected Laplacian systems in nearly logarithmic space.” 58th Annual IEEE Symposium on Foundations of Computer Science (FOCS 17), 2017.
Version History
ArXiv, 15 August 2017 https://arxiv.org/abs/1708.04634

We give a deterministic $$\tilde{O}(\log n)$$-space algorithm for approximately solving linear systems given by Laplacians of undirected graphs, and consequently also approximating hitting times, commute times, and escape probabilities for undirected graphs. Previously, such systems were known to be solvable by randomized algorithms using $$O(\log n)$$ space (Doron, Le Gall, and Ta-Shma, 2017) and hence by deterministic algorithms using $$O(\log^{3/2} n)$$ space (Saks and Zhou, FOCS 1995 and JCSS 1999).
Our algorithm combines ideas from time-efficient Laplacian solvers (Spielman and Teng, STOC 04; Peng and Spielman, STOC 14) with ideas used to show that Undirected S-T Connectivity is in deterministic logspace (Reingold, STOC 05 and JACM 08; Rozenman and Vadhan, RANDOM 05).
Chen, Yi-Hsiu, Kai-Min Chung, Ching-Yi Lai, Salil P. Vadhan, and Xiaodi Wu. “Computational notions of quantum min-entropy.” In Poster presentation at QIP 2017 and oral presentation at QCrypt 2017, 2017.

Version History

ArXiv v1, 24 April 2017 https://arxiv.org/abs/1704.07309v1
ArXiv v2, 25 April 2017 https://arxiv.org/abs/1704.07309v2
ArXiv v3, 9 September 2017 https://arxiv.org/abs/1704.07309v3
ArXiv v4, 5 October 2017 https://arxiv.org/abs/1704.07309v4

We initiate the study of computational entropy in the quantum setting. We investigate to what extent the classical notions of computational entropy generalize to the quantum setting, and whether quantum analogues of classical theorems hold. Our main results are as follows. (1) The classical Leakage Chain Rule for pseudoentropy can be extended to the case that the leakage information is quantum (while the source remains classical). Specifically, if the source has pseudoentropy at least $$k$$, then it has pseudoentropy at least $$k-\ell$$ conditioned on an $$\ell$$-qubit leakage. (2) As an application of the Leakage Chain Rule, we construct the first quantum leakage-resilient stream-cipher in the bounded-quantum-storage model, assuming the existence of a quantum-secure pseudorandom generator. (3) We show that the general form of the classical Dense Model Theorem (interpreted as the equivalence between two definitions of pseudo-relative-min-entropy) does not extend to quantum states. Along the way, we develop quantum analogues of some classical techniques (e.g. the Leakage Simulation Lemma, which is proven by a Non-uniform Min-Max Theorem or Boosting). On the other hand, we also identify some classical techniques (e.g. Gap Amplification) that do not work in the quantum setting. Moreover, we introduce a variety of notions that combine quantum information and quantum complexity, and this raises several directions for future work.

2016
Chen, Yiling, Stephen Chong, Ian A. Kash, Tal Moran, and Salil P. Vadhan. “Truthful mechanisms for agents that value privacy.” ACM Transactions on Economics and Computation 4, no. 3 (2016).

Version History: Special issue on EC ‘13. Preliminary version at arXiv:1111.5472 [cs.GT] (Nov. 2011).

Recent work has constructed economic mechanisms that are both truthful and differentially private. In these mechanisms, privacy is treated separately from truthfulness; it is not incorporated in players’ utility functions (and doing so has been shown to lead to nontruthfulness in some cases). In this work, we propose a new, general way of modeling privacy in players’ utility functions. Specifically, we only assume that if an outcome $${o}$$ has the property that any report of player $${i}$$ would have led to $${o}$$ with approximately the same probability, then $${o}$$ has a small privacy cost to player $${i}$$. We give three mechanisms that are truthful with respect to our modeling of privacy: for an election between two candidates, for a discrete version of the facility location problem, and for a general social choice problem with discrete utilities (via a VCG-like mechanism). As the number $${n}$$ of players increases, the social welfare achieved by our mechanisms approaches optimal (as a fraction of $${n}$$).

Altman, Micah, Alexandra Wood, David R. O'Brien, Salil Vadhan, and Urs Gasser. “Towards a modern approach to privacy-aware government data releases.” Berkeley Technology Law Journal 30, no. 3 (2016): 1967-2072.
Governments are under increasing pressure to publicly release collected data in order to promote transparency, accountability, and innovation. Because much of the data they release pertains to individuals, agencies rely on various standards and interventions to protect privacy interests while supporting a range of beneficial uses of the data. However, there are growing concerns among privacy scholars, policymakers, and the public that these approaches are incomplete, inconsistent, and difficult to navigate. To identify gaps in current practice, this Article reviews data released in response to freedom of information and Privacy Act requests, traditional public and vital records, official statistics, and e-government and open government initiatives. It finds that agencies lack formal guidance for implementing privacy interventions in specific cases. Most agencies address privacy by withholding or redacting records that contain directly or indirectly identifying information based on an ad hoc balancing of interests, and different government actors sometimes treat similar privacy risks vastly differently. These observations demonstrate the need for a more systematic approach to privacy analysis and also suggest a new way forward. In response to these concerns, this Article proposes a framework for a modern privacy analysis informed by recent advances in data privacy from disciplines such as computer science, statistics, and law. Modeled on an information security approach, this framework characterizes and distinguishes between privacy controls, threats, vulnerabilities, and utility. When developing a data release mechanism, policymakers should specify the desired data uses and expected benefits, examine each stage of the data lifecycle to identify privacy threats and vulnerabilities, and select controls for each lifecycle stage that are consistent with the uses, threats, and vulnerabilities at that stage. 
This Article sketches the contours of this analytical framework, populates selected portions of its contents, and illustrates how it can inform the selection of privacy controls by discussing its application to two real-world examples of government data releases.
Nissim, Kobbi, Uri Stemmer, and Salil Vadhan. “Locating a small cluster privately.” In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS ‘16), 413-427. ACM, 2016.

Version History: Full version posted as arXiv:1604.05590 [cs.DS].

We present a new algorithm for locating a small cluster of points with differential privacy [Dwork, McSherry, Nissim, and Smith, 2006]. Our algorithm has implications to private data exploration, clustering, and removal of outliers. Furthermore, we use it to significantly relax the requirements of the sample and aggregate technique [Nissim, Raskhodnikova, and Smith, 2007], which allows compiling of “off the shelf” (non-private) analyses into analyses that preserve differential privacy.