Simulants: Synthetic Clinical Trial Data via Subject-Level Privacy-Preserving Synthesis
Published in | AMIA ... Annual Symposium proceedings Vol. 2022; p. 231 |
---|---|
Main Authors | Beigi, Mandis; Shafquat, Afrah; Mezey, Jason; Aptekar, Jacob |
Format | Journal Article |
Language | English |
Published | United States, 2022 |
Subjects | Confidentiality; Data Accuracy; Humans; Information Dissemination - methods; Privacy |
ISSN | 1942-597X 1559-4076 |
Abstract | Clinical trials capture high-quality data for millions of patients each year, yet these data are largely unavailable for research beyond the scope of any individual trial because of regulatory, intellectual property, and patient privacy barriers. Synthetic clinical trial data that capture the analytical properties of the source data could provide significant value for research and drug development by making insights widely available while protecting the privacy of the participants. We present a method, "Simulants," for generating research-grade synthetic clinical trial data from a real data source. We compared the fidelity and privacy preservation of Simulants with those of state-of-the-art deep learning synthesizers and found that Simulants had superior performance on clinical trial data, as assessed both by established metrics and by critical clinical features. We also demonstrate how Simulants' privacy settings may be configured to conform to specific privacy policies governing data sharing. |
---|---|
Author | Shafquat, Afrah Beigi, Mandis Mezey, Jason Aptekar, Jacob |
Author_xml | – sequence: 1 givenname: Mandis surname: Beigi fullname: Beigi, Mandis organization: Medidata, New York, NY, USA – sequence: 2 givenname: Afrah surname: Shafquat fullname: Shafquat, Afrah organization: Medidata, New York, NY, USA – sequence: 3 givenname: Jason surname: Mezey fullname: Mezey, Jason organization: Cornell University, Ithaca, NY, USA – sequence: 4 givenname: Jacob surname: Aptekar fullname: Aptekar, Jacob organization: Medidata, New York, NY, USA |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/37128411$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | 2022 AMIA - All rights reserved. |
Discipline | Medicine |
EISSN | 1559-4076 |
ExternalDocumentID | 37128411 |
Genre | Journal Article |
ISSN | 1942-597X |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | 2022 AMIA - All rights reserved. |
PMID | 37128411 |
PublicationCentury | 2000 |
PublicationDate | 2022-00-00 20220101 |
PublicationDateYYYYMMDD | 2022-01-01 |
PublicationDate_xml | – year: 2022 text: 2022-00-00 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | AMIA ... Annual Symposium proceedings |
PublicationTitleAlternate | AMIA Annu Symp Proc |
PublicationYear | 2022 |
SourceID | proquest pubmed |
SourceType | Aggregation Database Index Database |
StartPage | 231 |
SubjectTerms | Confidentiality Data Accuracy Humans Information Dissemination - methods Privacy |
Title | Simulants: Synthetic Clinical Trial Data via Subject-Level Privacy-Preserving Synthesis |
URI | https://www.ncbi.nlm.nih.gov/pubmed/37128411 https://www.proquest.com/docview/2808586981 |
Volume | 2022 |