MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis

Bibliographic Details
Published in	arXiv.org
Main Authors	Cho, Joseph; Zakka, Cyril; Kaur, Dhamanpreet; Shad, Rohan; Wightman, Ross; Chaudhari, Akshay; Hiesinger, William
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 10.07.2024
Subjects	Accuracy; Demographics; Medical imaging; Privacy; Synthesis

More Information
Summary:	Diffusion models have recently gained significant traction due to their ability to generate high-fidelity and diverse images and videos conditioned on text prompts. In medicine, this application promises to address the critical challenge of data scarcity, a consequence of barriers in data sharing, stringent patient privacy regulations, and disparities in patient population and demographics. By generating realistic and varying medical 2D and 3D images, these models offer a rich, privacy-respecting resource for algorithmic training and research. To this end, we introduce MediSyn, a pair of instruction-tuned text-guided latent diffusion models with the ability to generate high-fidelity and diverse medical 2D and 3D images across specialties and modalities. Through established metrics, we show significant improvement in broad medical image and video synthesis guided by text prompts.
ISSN:	2331-8422