The practical potential of generating synthetic data for cardiovascular disease research.
Summary
Cardiovascular diseases (CVDs) remain a leading
global health issue, contributing to significant mortality and
reduced quality of life. While advancements in big data and
precision medicine have improved diagnostic and prognostic
tools, challenges persist due to high economic costs, data privacy
regulations and dataset imbalances, particularly in underrepresented
groups. Synthetic data, generated through techniques
like generative adversarial networks (GANs) and differential
privacy (DP), offer a promising solution. These methods allow the
creation of large, diverse and anonymized datasets that mimic
real-world data while ensuring patient privacy. Synthetic data can
address gender and class imbalances, enhance model training and
improve imaging quality in CVD research. However, limitations
remain regarding data quality, trust in synthetic outputs and
practical implementation. Collaborative efforts among clinicians,
researchers and policymakers are essential to realize the full
potential of synthetic data in overcoming current barriers to
CVD research. This work highlights both the opportunities and
challenges of using synthetic data, emphasizing its role as a future
tool to advance cardiovascular research.