Differentially Private Synthetic Data without Training

Published: March 24, 2025, 15:48
Speakers: Zinan Lin
Host: Kim Laine

Generating differentially private (DP) synthetic data that closely resembles the original data while preserving user privacy is a scalable way to address privacy concerns in today's data-driven world.

In this talk, I will introduce Private Evolution (PE), a new training-free framework for DP synthetic data generation, which contrasts with existing approaches that rely on training DP generative models. PE treats foundation models as black boxes and uses only their inference APIs. We demonstrate that across both images and text, PE: (1) matches or even outperforms prior state-of-the-art (SoTA) methods in the fidelity-privacy trade-off without any model training; (2) enables the use of advanced open-source models (e.g., Mixtral) and API-based models (e.g., GPT-3.5), where previous SoTA approaches are inapplicable; and (3) is more computationally efficient than prior SoTA methods.
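
To make the black-box setting concrete, the following is a minimal sketch of a PE-style loop under stated assumptions: `random_api` and `variation_api` are hypothetical placeholders for a foundation model's inference calls, and the distance measure and Gaussian noise scale are illustrative rather than the exact mechanism and calibration used in PE.

```python
# Sketch of a Private Evolution-style loop: synthetic candidates are generated
# and refined purely through black-box inference calls, while private data only
# contributes through a noised (DP) nearest-neighbor vote histogram.
import numpy as np

def random_api(n, dim=16):
    # Placeholder for an unconditional generation call to a foundation model.
    return np.random.randn(n, dim)

def variation_api(samples):
    # Placeholder for a "generate variations of these samples" call.
    return samples + 0.1 * np.random.randn(*samples.shape)

def private_evolution(private_data, n_synthetic=100, iterations=10, noise_scale=1.0):
    synthetic = random_api(n_synthetic, dim=private_data.shape[1])
    for _ in range(iterations):
        # Each private point votes for its nearest synthetic candidate.
        dists = np.linalg.norm(private_data[:, None, :] - synthetic[None, :, :], axis=-1)
        votes = np.bincount(dists.argmin(axis=1), minlength=n_synthetic).astype(float)
        # Add Gaussian noise to the vote histogram for differential privacy
        # (noise_scale would be calibrated to the target privacy budget).
        votes += np.random.normal(0.0, noise_scale, size=n_synthetic)
        votes = np.clip(votes, 0.0, None)
        total = votes.sum()
        probs = votes / total if total > 0 else np.full(n_synthetic, 1.0 / n_synthetic)
        # Resample promising candidates and ask the model for variations of them.
        parents = synthetic[np.random.choice(n_synthetic, size=n_synthetic, p=probs)]
        synthetic = variation_api(parents)
    return synthetic

# Example usage with toy data:
# synthetic = private_evolution(np.random.randn(500, 16))
```
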

Additionally, I will discuss recent extensions of PE, both from our work and from the broader community, including the integration of data simulators, the fusion of knowledge from multiple models for DP data synthesis, and applications in federated learning. We hope that PE unlocks the full potential of foundation models in privacy-preserving machine learning and accelerates the adoption of DP synthetic data across industries.