跳到主要导航 跳到搜索 跳到主要内容

VertiMRF: Differentially Private Vertical Federated Data Synthesis

  • Xi'an Jiaotong University
  • Alibaba Group Holding Ltd.

科研成果: 书/报告/会议事项章节会议稿件同行评审

6 引用 (Scopus)

摘要

Data synthesis is a promising solution to share data for various downstream analytic tasks without exposing raw data. However, without a theoretical privacy guarantee, a synthetic dataset would still leak some sensitive information in raw data. As a countermeasure, differential privacy is widely adopted to safeguard data synthesis by strictly limiting the released information. This technique is advantageous yet presents significant challenges in the vertical federated setting, where data attributes are distributed among different data parties. The main challenge lies in maintaining privacy while efficiently and precisely reconstructing the correlation between attributes. In this paper, we propose a novel algorithm called VertiMRF, designed explicitly for generating synthetic data in the vertical setting and providing differential privacy protection for all information shared from data parties. We introduce techniques based on the Flajolet-Martin (FM) sketch for encoding local data satisfying differential privacy and estimating cross-party marginals. We provide theoretical privacy and utility proof for encoding in this multi-attribute data. Collecting the locally generated private Markov Random Field (MRF) and the sketches, a central server can reconstruct a global MRF, maintaining the most useful information. Two critical techniques introduced in our VertiMRF are dimension reduction and consistency enforcement, preventing the noise of FM sketch from overwhelming the information of attributes with large domain sizes when building the global MRF. These two techniques allow flexible and inconsistent binning strategies of local private MRF and the data sketching module, which can preserve information to the greatest extent. We conduct extensive experiments on four real-world datasets to evaluate the effectiveness of VertiMRF. End-to-end comparisons demonstrate the superiority of VertiMRF.

源语言英语
主期刊名KDD 2024 - Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
出版商Association for Computing Machinery
4431-4442
页数12
ISBN(电子版)9798400704901
DOI
出版状态已出版 - 24 8月 2024
活动30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024 - Barcelona, 西班牙
期限: 25 8月 202429 8月 2024

出版系列

姓名Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
ISSN(印刷版)2154-817X

会议

会议30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024
国家/地区西班牙
Barcelona
时期25/08/2429/08/24

学术指纹

探究 'VertiMRF: Differentially Private Vertical Federated Data Synthesis' 的科研主题。它们共同构成独一无二的指纹。

引用此