Role of the Pretraining and the Adaptation data sizes for low-resource real-time MRI video segmentation

Bibliographic Details
Published in Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998), pp. 1-5
Main Authors Tholan, Masoud Thajudeen, Hegde, Vinayaka, Sharma, Chetan, Ghosh, Prasanta Kumar
Format Conference Proceeding
Language English
Published IEEE 06.04.2025
ISSN 2379-190X
DOI 10.1109/ICASSP49660.2025.10889096

Abstract Real-time Magnetic Resonance Imaging (rtMRI) is frequently used in speech production studies because it provides a complete view of the vocal tract during articulation. This study investigates the effectiveness of rtMRI in analyzing vocal tract movements by employing the SegNet and UNet models for Air-Tissue Boundary (ATB) segmentation. We pretrained a few base models using increasing numbers of subjects and videos and assessed their performance on two datasets. The first consists of unseen subjects with unseen videos from the same data source, on which we achieved 0.33% and 0.91% (Pixel-wise Classification Accuracy (PCA) and Dice Coefficient, respectively) better than the matched condition. The second comprises unseen videos from a new data source, on which we obtained 99.63% and 98.09% (PCA and Dice Coefficient, respectively) of the matched-condition performance. Here, matched-condition performance refers to the performance of a model trained only on the test subjects, which was set as the benchmark for the other models. Our findings highlight the significance of fine-tuning and adapting models with limited data. Notably, we demonstrated that effective model adaptation can be achieved with as few as 15 rtMRI frames from any new dataset.
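The two metrics named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact definitions, so the code below assumes the standard ones: pixel-wise classification accuracy as the fraction of correctly labeled pixels, and the Dice coefficient as twice the foreground overlap divided by the total foreground size. The function names, the binary mask encoding (1 = foreground, 0 = background), and the helper `relative_to_matched` are hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Binary mask as rows of 0/1 labels (encoding assumed for illustration).
Mask = Sequence[Sequence[int]]


def pixel_accuracy(pred: Mask, target: Mask) -> float:
    """Fraction of pixels whose predicted label equals the reference label."""
    total = 0
    correct = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += 1
            correct += int(p == t)
    return correct / total


def dice_coefficient(pred: Mask, target: Mask) -> float:
    """Dice = 2 * overlap / (size of pred + size of target), on label 1."""
    overlap = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            overlap += int(p == 1 and t == 1)
            pred_sum += int(p == 1)
            target_sum += int(t == 1)
    if pred_sum + target_sum == 0:
        # Both masks empty: treated as perfect agreement (a common convention).
        return 1.0
    return 2 * overlap / (pred_sum + target_sum)


def relative_to_matched(score: float, matched_score: float) -> float:
    """Express a score as a percentage of the matched-condition score."""
    return 100.0 * score / matched_score


# Toy 2x3 masks, not data from the paper.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
accuracy = pixel_accuracy(pred, target)
dice = dice_coefficient(pred, target)
```

Under these assumptions, a figure such as "99.63% of the matched-condition performance" would correspond to `relative_to_matched(score, matched_score)`. Whether the reported 0.33% and 0.91% gains are absolute or relative differences is not stated in this record.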
Author Affiliations
Tholan, Masoud Thajudeen (masoudt@iisc.ac.in): Indian Institute of Science, Department of Electrical Engineering, Bengaluru, India
Hegde, Vinayaka (vinayakahegde619@gmail.com): Indian Institute of Science, Department of Electrical Engineering, Bengaluru, India
Sharma, Chetan (chetansharma@iisc.ac.in): Indian Institute of Science, Department of Electrical Engineering, Bengaluru, India
Ghosh, Prasanta Kumar (prasantg@iisc.ac.in): Indian Institute of Science, Department of Electrical Engineering, Bengaluru, India
EISBN 9798350368741
EISSN 2379-190X
Subject Terms Accuracy; Adaptation models; Air-Tissue Boundary Segmentation; Atmospheric modeling; Data models; Fine-tuning; Image segmentation; Magnetic resonance imaging; Principal component analysis; Real-Time Magnetic Resonance Imaging; Real-time systems; SegNet; Soft sensors; U-Net; Videos
URI https://ieeexplore.ieee.org/document/10889096