Autodelineation of Treatment Target Volume for Radiation Therapy Using Large Language Model-Aided Multimodal Learning

Bibliographic Details
Published in	International journal of radiation oncology, biology, physics Vol. 121; no. 1; p. 230
Main Authors	Rajendran, Praveenbalaji, Chen, Yizheng, Qiu, Liang, Niedermayr, Thomas, Liu, Wu, Buyyounouski, Mark, Bagshaw, Hilary, Han, Bin, Yang, Yong, Kovalchuk, Nataliya, Gu, Xuejun, Hancock, Steven, Xing, Lei, Dai, Xianjin
Format	Journal Article
Language	English
Published	United States 01.01.2025
Subjects	Algorithms; Artificial Intelligence; Humans; Male; Oropharyngeal Neoplasms - radiotherapy; Prostatic Neoplasms - diagnostic imaging; Prostatic Neoplasms - radiotherapy; Radiotherapy Planning, Computer-Assisted - methods; Tumor Burden
Abstract	Purpose: Artificial intelligence-aided methods have made significant progress in the auto-delineation of normal tissues. However, these approaches struggle with the auto-contouring of radiation therapy target volume. Our goal was to model the delineation of target volume as a clinical decision-making problem, resolved by leveraging large language model-aided multimodal learning approaches. Methods and Materials: A vision-language model, termed Medformer, has been developed, employing the hierarchical vision transformer as its backbone and incorporating large language models to extract text-rich features. The contextually embedded linguistic features are seamlessly integrated into visual features for language-aware visual encoding through the visual language attention module. Metrics, including Dice similarity coefficient (DSC), intersection over union (IOU), and 95th percentile Hausdorff distance (HD95), were used to quantitatively evaluate the performance of our model. The evaluation was conducted on an in-house prostate cancer dataset and a public oropharyngeal carcinoma dataset, totaling 668 subjects. Results: Our Medformer achieved a DSC of 0.81 ± 0.10 versus 0.72 ± 0.10, IOU of 0.73 ± 0.12 versus 0.65 ± 0.09, and HD95 of 9.86 ± 9.77 mm versus 19.13 ± 12.96 mm for delineation of gross tumor volume on the prostate cancer dataset. Similarly, on the oropharyngeal carcinoma dataset, it achieved a DSC of 0.77 ± 0.11 versus 0.72 ± 0.09, IOU of 0.70 ± 0.09 versus 0.65 ± 0.07, and HD95 of 7.52 ± 4.8 mm versus 13.63 ± 7.13 mm, representing significant improvements (P < 0.05). For delineating the clinical target volume, Medformer achieved a DSC of 0.91 ± 0.04, IOU of 0.85 ± 0.05, and HD95 of 2.98 ± 1.60 mm, comparable with other state-of-the-art algorithms. Conclusions: Auto-delineation of the treatment target based on multimodal learning outperforms conventional approaches that rely purely on visual features. Our method could be adopted into routine practice to rapidly contour clinical target volume/gross tumor volume.
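Note on the evaluation metrics: the abstract reports DSC, IOU, and HD95 but this record does not say how they were computed. The block below is only a minimal sketch of how such metrics are commonly calculated for binary masks, not the authors' implementation. The function names (dice_and_iou, surface_points, hd95), the definition of a surface voxel, and the choice to take the larger of the two directed 95th-percentile distances are all assumptions made for illustration.

```python
import numpy as np


def dice_and_iou(pred, ref):
    """Return (DSC, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    inter = np.logical_and(pred, ref).sum()
    total = pred.sum() + ref.sum()
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0
    union = total - inter
    return 2.0 * inter / total, inter / union


def surface_points(mask, spacing):
    """Physical coordinates of mask voxels that touch background.

    Assumption: a voxel is on the surface if any face-adjacent
    neighbour (or the array edge) is background.
    """
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    interior = padded.copy()
    for axis in range(mask.ndim):
        for shift in (1, -1):
            interior &= np.roll(padded, shift, axis=axis)
    surface = padded & ~interior
    coords = np.argwhere(surface) - 1  # undo the padding offset
    return coords * np.asarray(spacing, dtype=float)


def hd95(pred, ref, spacing):
    """95th percentile Hausdorff distance in the units of `spacing`.

    Assumption: HD95 is the larger of the two directed 95th-percentile
    surface distances; other conventions exist. The brute-force distance
    matrix is only practical for small masks.
    """
    p = surface_points(pred, spacing)
    r = surface_points(ref, spacing)
    if len(p) == 0 or len(r) == 0:
        return float("nan")
    dist = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    pred_to_ref = dist.min(axis=1)
    ref_to_pred = dist.min(axis=0)
    return float(max(np.percentile(pred_to_ref, 95),
                     np.percentile(ref_to_pred, 95)))


# Toy example: two 4x4 squares offset by one pixel, 1 mm isotropic spacing.
ref_mask = np.zeros((10, 10), dtype=bool)
ref_mask[2:6, 2:6] = True
pred_mask = np.zeros((10, 10), dtype=bool)
pred_mask[2:6, 3:7] = True

dsc, iou = dice_and_iou(pred_mask, ref_mask)
distance = hd95(pred_mask, ref_mask, spacing=(1.0, 1.0))
print(dsc, iou, distance)  # 0.75 0.6 1.0
```

In the toy example the masks share 12 of 16 pixels each, so DSC is 24/32 and IoU is 12/20; a one-pixel shift gives an HD95 of 1 mm. For clinical volumes, voxel spacing would come from the image header, and a distance-transform approach would replace the brute-force distance matrix.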
Author_xml	All authors: organization: Department of Radiation Oncology, Stanford University, Stanford, California
sequence: 1; fullname: Rajendran, Praveenbalaji
sequence: 2; fullname: Chen, Yizheng
sequence: 3; fullname: Qiu, Liang
sequence: 4; fullname: Niedermayr, Thomas
sequence: 5; fullname: Liu, Wu
sequence: 6; fullname: Buyyounouski, Mark
sequence: 7; fullname: Bagshaw, Hilary
sequence: 8; fullname: Han, Bin
sequence: 9; fullname: Yang, Yong
sequence: 10; fullname: Kovalchuk, Nataliya
sequence: 11; fullname: Gu, Xuejun
sequence: 12; fullname: Hancock, Steven
sequence: 13; fullname: Xing, Lei
sequence: 14; fullname: Dai, Xianjin; email: xjdai@stanford.edu
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/39117164 (View this record in MEDLINE/PubMed)
ContentType	Journal Article
Copyright	Copyright © 2024 Elsevier Inc. All rights reserved.
DOI	10.1016/j.ijrobp.2024.07.2149
Discipline	Medicine
EISSN	1879-355X
ISSN	1879-355X
IsPeerReviewed	true
IsScholarly	true
Issue	1
PMID	39117164
PublicationTitle	International journal of radiation oncology, biology, physics
PublicationTitleAlternate	Int J Radiat Oncol Biol Phys
PublicationYear	2025
References	39040646 - ArXiv. 2024 Jul 10:arXiv:2407.07296v1.
StartPage	230
URI	https://www.ncbi.nlm.nih.gov/pubmed/39117164 https://www.proquest.com/docview/3090946060
Volume	121