Department of Computational and Data Sciences
Department Seminar
Speaker : Prof. Krishna Pillutla, Assistant Professor, IITM
Title : “Toward realizing user-level differential privacy at scale”
Date & Time : February 24, 2025 (Monday), 11:00 AM
Venue : # 102, CDS Seminar Hall
ABSTRACT
There is a growing realization that in-domain user data is crucial to unlocking the full potential of AI models. However, use of such data comes at the cost of increased risk of leaking information and compromising the privacy of individual users. In this talk, I’ll present some building blocks required to realize user-level differential privacy (DP) to protect the privacy of all the (possibly related) examples contributed by any individual user. We focus on some fundamental questions: Why do we even need user-level DP? How do we attain it and audit it?
We start by showing that an adversary can reliably infer whether a user’s data was used in training. Crucially, this is possible using only a few fresh samples of the user’s data that were not used for training. Heuristic mitigation strategies have limited success, motivating the need for the strong guarantees provided by user-level DP.
In the next part of the talk, we look at private learning algorithms, focusing on a class of noisy stochastic gradient algorithms that inject temporally correlated noise. While these algorithms enjoy state-of-the-art utility, they suffer from quadratic runtime complexity. We improve the runtime complexity to nearly linear at no cost in the privacy-utility tradeoff, both in theory (with near-optimal error bounds) and in practice (with significant empirical improvements).
The final step of a practical (user-level) DP implementation is an empirical audit to verify the correctness of the claimed DP guarantee. We present a randomized auditing procedure that significantly reduces the number of training runs needed to give a high-probability lower bound on the privacy leakage. Along the way, we introduce a benchmark of large-scale user-stratified datasets to enable investigations into user-level DP and federated learning at the scale of foundation models.
I’ll conclude with a set of future research directions and concrete applications. This talk is based on collaborative work with many collaborators and will touch upon results from the following publications: EMNLP (2024; Oral), FOCS (2024), ICLR (2024), NeurIPS (2023), NeurIPS D&B (2023), SaTML (2025).
BIO: Krishna Pillutla is an assistant professor at the Wadhwani School of Data Science and AI at IIT Madras in India. Previously, he was a visiting researcher (postdoc) at Google Research in the Federated Learning team. He obtained his Ph.D. at the University of Washington where he was advised by Zaid Harchaoui and Sham Kakade. Before that, he received his M.S. from Carnegie Mellon University and B.Tech. from IIT Bombay.
Krishna’s research has been recognized by a NeurIPS outstanding paper award (2021), a JP Morgan Ph.D. fellowship (2019-20), and two American Statistical Association (ASA) Student Paper Award Honorable Mentions.
Host Faculty: Dr. Danish Pruthi
ALL ARE WELCOME