Differentially private hypothesis testing for linear regression

Citation:

Alabi, Daniel, and Salil Vadhan. “Differentially private hypothesis testing for linear regression.” Journal of Machine Learning Research 24, no. 361 (2023): 1-50.
ArXiv 2022 (PDF, 2.13 MB)
NeurIPS 2022 (PDF, 564 KB)
JMLR 2023 (PDF, 1.65 MB)

Version History:

Preliminary versions appeared in NeurIPS '22, were posted as arXiv:2206.14449, and were presented at TPDP '21 (poster), IMS '22 (oral), and SEA '22 (oral). (Previously published as "Hypothesis Testing for Differentially Private Linear Regression.")

Abstract:

In this work, we design differentially private hypothesis tests for the following problems in the general linear model: testing a linear relationship and testing for the presence of mixtures. The majority of our hypothesis tests are based on differentially private versions of the $F$-statistic for the general linear model framework, whose non-private counterparts are uniformly most powerful unbiased. We also present other tests for these problems, one of which is based on the differentially private nonparametric tests of Couch, Kazan, Shi, Bray, and Groce (CCS 2019) and is especially suited to the small-dataset regime. We show that the differentially private $F$-statistic converges to the asymptotic distribution of its non-private counterpart; as a corollary, the statistical power of the differentially private $F$-statistic converges to that of the non-private $F$-statistic. Through a suite of Monte Carlo-based experiments, we show that our tests achieve desired significance levels and have power that approaches the power of the non-private tests as the sample size or the privacy-loss parameter increases. We also show when our tests outperform existing methods in the literature.
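
For orientation, the classical $F$-test in the general linear model compares a restricted model, with $q$ linear constraints imposed, against the full model with $p$ parameters fit on $n$ observations:

$$F = \frac{(\mathrm{SSE}_0 - \mathrm{SSE}_1)/q}{\mathrm{SSE}_1/(n-p)},$$

where $\mathrm{SSE}_0$ and $\mathrm{SSE}_1$ are the residual sums of squares of the restricted and full models. Below is a minimal Python sketch of this test in the simplest case (testing a single slope, so $q = 1$, $p = 2$), followed by a naive differentially private variant that adds Laplace noise to clipped sufficient statistics. The function names, the clipping bound $B$, the even split of the privacy budget, and the treatment of $n$ as public are assumptions made here for illustration; this is not the paper's algorithm or noise calibration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def f_test(x, y):
    """Classical F-test of H0: no linear relationship (slope = 0) in the
    simple linear model y = b0 + b1*x + noise (q = 1, p = 2)."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])       # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = np.sum((y - X @ beta) ** 2)          # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)          # total sum of squares
    f_stat = (sst - sse) / (sse / (n - 2))
    return f_stat, stats.f.sf(f_stat, 1, n - 2)

def dp_f_test(x, y, eps, B=1.0):
    """Naive eps-DP variant (illustrative assumption, not the paper's method):
    clip data to [-B, B], add Laplace noise to the sufficient statistics,
    and rebuild the F-statistic from the noisy statistics. n is treated as
    public, which is itself an assumption."""
    x, y = np.clip(x, -B, B), np.clip(y, -B, B)
    n = len(x)
    # Sufficient statistics paired with their add/remove-one-record
    # sensitivities after clipping.
    suff = [(x.sum(), B), (y.sum(), B),
            ((x * x).sum(), B * B), ((x * y).sum(), B * B), ((y * y).sum(), B * B)]
    eps_i = eps / len(suff)                    # naive even split of the budget
    sx, sy, sxx, sxy, syy = (s + rng.laplace(scale=sens / eps_i) for s, sens in suff)
    b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # noisy OLS slope
    b0 = (sy - b1 * sx) / n                          # noisy OLS intercept
    # Residual and total sums of squares from the noisy statistics; under
    # heavy noise sse can turn non-positive, so a real test must guard for that.
    sse = syy - b0 * sy - b1 * sxy
    sst = syy - sy ** 2 / n
    f_stat = (sst - sse) / (sse / (n - 2))
    # Using the F(1, n-2) reference distribution here is only asymptotically
    # justified, in the spirit of the convergence result stated in the abstract.
    return f_stat, stats.f.sf(f_stat, 1, n - 2)

x = rng.normal(size=1000)
y = 0.3 * x + rng.normal(size=1000)
print("non-private:", f_test(x, y))
print("eps = 1 DP :", dp_f_test(x, y, eps=1.0))
```

Consistent with the asymptotics stated in the abstract, the noisy statistic tracks the non-private one as the sample size or the privacy-loss parameter grows; for small $n$ or small $\varepsilon$ the noisy residual sum of squares can become non-positive, which is one reason carefully designed private tests such as those in the paper are needed.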

Publisher's Version
