 

AIM high, stay private: Differentially private synthetic data enables public release of behavioral health information with high utility

M. Ghasemizade, J. Lovato, C. M. Danforth, P. S. Dodds, L. S. P. Bloomfield, M. Price, Team LEMURS, J. P. Near

JAMIA Open, 9, ooag066, 2026

Times cited: 0

Abstract:

Objective: Sharing behavioral health and wearable data poses privacy challenges, as traditional de-identification remains vulnerable to re-identification. Differential privacy (DP) provides mathematical guarantees through a tunable privacy budget, $\varepsilon$. This study evaluates the feasibility of generating and releasing DP synthetic behavioral health data with high analytical utility, identifying practical $\varepsilon$ values for public data sharing.

Materials and methods: We analyzed physiological data from wearable devices and self-reported data from Phase 1 of the Lived Experiences Measured Using Rings Study (LEMURS), which tracked sleep, stress, and well-being among first-year college students. Three DP synthetic data generators (AIM, MST, and PATECTGAN) were evaluated across privacy budgets ranging from $\varepsilon = 1$ to $\varepsilon = 100$. Utility was assessed using L1/L2 errors, correlation, regression, and UMAP analyses; vulnerability was assessed via privacy attacks.

Results: AIM outperformed MST and PATECTGAN in preserving both statistical and analytical properties of the original data. For the Survey dataset, the lowest marginal errors occurred at $\varepsilon = 5$ and $\varepsilon = 10$. Correlation, regression, and UMAP analyses confirmed that AIM-generated data closely replicated original relationships at moderate $\varepsilon$ values.

Discussion: The choice of privacy budget remains an open question; it is task-agnostic and dataset-specific. Moderate privacy budgets ($5 \le \varepsilon \le 10$) maintained key associations between physiological and psychological measures while ensuring privacy. AIM's workload-aware design effectively allocated noise toward relevant features, enhancing performance.

Conclusion: A privacy budget of $\varepsilon = 5$ offers a practical balance between data utility and participant privacy for LEMURS behavioral health data sharing.
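
The abstract reports L1/L2 marginal errors as a utility measure without defining them. As a rough illustration only, the sketch below computes one common form of these errors: the L1 and L2 distances between the normalized one-way marginals of a single categorical column in the original and synthetic data. The function names, the toy column, and the choice of one-way normalized marginals are assumptions made for illustration; the paper's exact definition may differ, and this is not the authors' code.

```python
from collections import Counter
from math import sqrt


def marginal(values):
    """Normalized one-way marginal: category -> relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def marginal_errors(original, synthetic):
    """Return (L1, L2) distances between the two one-way marginals."""
    p = marginal(original)
    q = marginal(synthetic)
    # Categories missing from one side contribute a frequency of 0 there.
    categories = set(p) | set(q)
    diffs = [p.get(c, 0.0) - q.get(c, 0.0) for c in categories]
    l1 = sum(abs(d) for d in diffs)
    l2 = sqrt(sum(d * d for d in diffs))
    return l1, l2


# Hypothetical toy column (not LEMURS data).
original = ["low", "low", "mid", "high"]
synthetic = ["low", "mid", "mid", "high"]
l1, l2 = marginal_errors(original, synthetic)
print(f"L1 error: {l1:.3f}, L2 error: {l2:.3f}")
```

Under this definition, smaller values mean the synthetic column's distribution is closer to the original, and identical distributions give zero for both errors.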

BibTeX:

@article{ghasemizade2026a,
  author =	 {Ghasemizade, Mohsen and Lovato, Juniper and
                  Danforth, Christopher M. and Dodds, Peter Sheridan
                  and Bloomfield, Laura S. P. and Price, Matthew and
                  {Team LEMURS} and Near, Joseph P.},
  title =	 {{AIM} high, stay private: Differentially private
                  synthetic data enables public release of behavioral
                  health information with high utility},
  journal =	 {JAMIA Open},
  year =	 {2026},
  volume =	 {9},
  number =	 {2},
  pages =	 {ooag066},
  note =	 {arXiv version: \url{https://arxiv.org/abs/2507.02971}},
}

 
