Synthetic data in the clinical laboratory : methods, applications, and future prospects

dc.contributor.author: Pillay, Tahir S.
dc.contributor.author: Van Deventer, Barbara Stroh
dc.contributor.author: Gwiliza, Siphokazi
dc.contributor.author: Subramoney, Evette L.
dc.contributor.author: Van Niekerk, Chantal
dc.contributor.email: tahir.pillay@up.ac.za
dc.date.accessioned: 2026-03-12T11:28:47Z
dc.date.available: 2026-03-12T11:28:47Z
dc.date.issued: 2026-04
dc.description: DATA AVAILABILITY : No data was used for the research described in the article.
dc.description.abstract: Clinical laboratories face stringent privacy constraints, limited datasets for rare conditions, and rising demands to validate AI algorithms and workflows safely. Synthetic data (artificially generated data that preserve the statistical characteristics of real clinical data without exposing patient identities) has emerged as a powerful tool to address these challenges. This review provides a comprehensive overview of synthetic data in the context of laboratory medicine. We begin by defining synthetic data and describing the main generation methods, from rule-based simulations to modern generative models (including generative adversarial networks, variational autoencoders, and diffusion models) with examples of their use in healthcare. We then delve into key applications in the clinical laboratory: quality control and method validation, education and training, machine learning development, test utilization and workflow simulation, and external quality assessment. Advantages of synthetic data, such as enhanced privacy, scalability, flexibility in simulating rare events, and cost-effectiveness, are discussed with illustrative case studies. We also examine challenges and limitations, including concerns about data fidelity, bias amplification, risks of model overfitting or re-identification attacks, and the cautious stance of regulators that still require real patient data for approvals. Finally, we outline future directions for synthetic data in laboratory medicine, from hybrid real–synthetic datasets and privacy-enhancing techniques to evolving regulatory frameworks and the potential to democratize data access globally. While synthetic data cannot entirely replace real clinical data, especially for regulatory validation, it can significantly augment what laboratories can design, test, and achieve, provided it is used with careful validation and ethical safeguards.
HIGHLIGHTS
• Synthetic laboratory data enable safer sharing for method and algorithm development.
• Three approaches: simulation, probabilistic models, and deep generative models.
• Use cases include middleware testing, rare results, and competency training.
• Governance needs privacy risk review, documentation, and drift monitoring.
dc.description.department: Chemical Pathology
dc.description.librarian: hj2026
dc.description.sdg: SDG-03: Good health and well-being
dc.description.uri: https://www.elsevier.com/locate/cca
dc.identifier.citation: Pillay, T.S., Van Deventer, B.S., Gwiliza, S., Subramoney, E.L. & Van Niekerk, C. 2026, 'Synthetic data in the clinical laboratory : methods, applications, and future prospects', Clinica Chimica Acta, vol. 585, art. 120878, pp. 1-13, doi : 10.1016/j.cca.2026.120878.
dc.identifier.issn: 0009-8981 (print)
dc.identifier.issn: 1873-3492 (online)
dc.identifier.other: 10.1016/j.cca.2026.120878
dc.identifier.uri: http://hdl.handle.net/2263/108932
dc.language.iso: en
dc.publisher: Elsevier
dc.rights: © 2026 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
dc.subject: Synthetic data
dc.subject: Method and algorithm development
dc.subject: Simulation
dc.subject: Probabilistic models
dc.subject: Deep generative models
dc.subject: Governance
dc.subject: Healthcare
dc.title: Synthetic data in the clinical laboratory : methods, applications, and future prospects
dc.type: Article

Files

Original bundle

Name: Pillay_Synthetic_2026.pdf
Size: 1.12 MB
Format: Adobe Portable Document Format
Description: Article
