Biography: Weijie Su (Chinese: 苏炜杰) is an Assistant Professor in the Department of Statistics and Data Science at the Wharton School and the Department of Computer and Information Science at the University of Pennsylvania. He is a co-director of Penn Research in Machine Learning. Prior to joining Penn, he received his Ph.D. in statistics from Stanford University in 2016 and his bachelor's degree in mathematics from Peking University in 2011. His research interests span privacy-preserving data analysis, optimization, high-dimensional statistics, and deep learning theory. He is a recipient of the Stanford Theodore W. Anderson Dissertation Award in 2016, an NSF CAREER Award in 2019, and a Sloan Research Fellowship in 2020.
Title: Gaussian Differential Privacy
Abstract: Privacy-preserving data analysis has been put on a firm mathematical foundation since the introduction of differential privacy (DP) in 2006. This privacy definition, however, has some well-known weaknesses: notably, it does not tightly handle composition. In this talk, we propose a relaxation of DP that we term "f-DP", which has a number of appealing properties and avoids some of the difficulties associated with prior relaxations. This relaxation allows for lossless reasoning about composition and post-processing and, notably, a direct way to analyze privacy amplification by subsampling. We define a canonical single-parameter family of definitions within our class, termed "Gaussian Differential Privacy", based on hypothesis testing of two shifted normal distributions. We prove that this family is focal to f-DP by introducing a central limit theorem, which shows that the privacy guarantees of any hypothesis-testing-based definition of privacy converge to Gaussian differential privacy in the limit under composition. We also demonstrate a central limit theorem phenomenon for high-dimensional query answering, which gives rise to an uncertainty-principle-style result showing that, for any mechanism, the product of its privacy guarantee and estimation loss is bounded below by the dimension. Finally, we demonstrate the use of the tools we develop by giving an improved analysis of the privacy guarantees of noisy stochastic gradient descent. This is based on joint work with Jinshuo Dong, Aaron Roth, Zhiqi Bu, Qi Long, and Linjun Zhang.
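As an illustration of the hypothesis-testing view described above (not part of the talk announcement itself): for μ-Gaussian Differential Privacy, the trade-off between the type-I error α and the smallest achievable type-II error when testing N(0,1) against N(μ,1) is G_μ(α) = Φ(Φ⁻¹(1 − α) − μ), where Φ is the standard normal CDF. A minimal sketch using only the Python standard library:

```python
# Sketch of the Gaussian DP trade-off function G_mu.
# G_mu(alpha) is the smallest type-II error achievable at type-I error
# alpha when distinguishing N(0, 1) from N(mu, 1); smaller mu means
# the two hypotheses are harder to tell apart, i.e., stronger privacy.
from statistics import NormalDist

def gdp_tradeoff(alpha: float, mu: float) -> float:
    """G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu)."""
    std = NormalDist()  # standard normal distribution
    return std.cdf(std.inv_cdf(1.0 - alpha) - mu)

# mu = 0: perfect privacy, the adversary can do no better than random
# guessing, so the trade-off curve is exactly 1 - alpha.
print(round(gdp_tradeoff(0.05, 0.0), 4))  # 0.95
# Larger mu: weaker privacy, the type-II error drops at the same alpha.
print(round(gdp_tradeoff(0.05, 1.0), 4))
print(round(gdp_tradeoff(0.05, 2.0), 4))
```

A property this family enjoys (from the f-DP framework): composing n mechanisms that are each μ-GDP yields a √n·μ-GDP mechanism, which is the tight, lossless composition behavior the abstract refers to.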
Key words: Differential privacy, hypothesis testing, Neyman-Pearson lemma, composition, subsampling, deep learning.