Linguistic Processing of Unattended Speech Under a Cocktail Party Listening Scenario
In the context of a cocktail party listening environment, the processing of different linguistic hierarchy levels in unattended speech and their influence on target speech recognition remain controversial. This study aims to investigate how different levels of linguistic structures (such as syllable...
Published in | Journal of speech, language, and hearing research Vol. 68; no. 1; pp. 69 - 78 |
---|---|
Main Authors | Lu, Lingxi; Wu, Danni; Zhang, Xiaoyu; Chen, Liangjie |
Format | Journal Article |
Language | English |
Published | United States, 02.01.2025 |
ISSN | 1092-4388 1558-9102 |
DOI | 10.1044/2024_JSLHR-24-00404 |
Abstract | In the context of a cocktail party listening environment, the processing of different linguistic hierarchy levels in unattended speech and their influence on target speech recognition remain controversial. This study aims to investigate how different levels of linguistic structures (such as syllable, word, and sentence) in competing speech influence the recognition of target speech in a speech-to-speech masking situation.
Thirty-six participants were instructed to recognize target speech when it was masked by competing speech whose masker type varied across syllables, words, and sentences. The perceived spatial location was altered to examine the interaction between linguistic unmasking effects and spatial unmasking effects. Recognition performance (i.e., intelligibility threshold) was determined by fitting psychometric functions to the recognition accuracies across four signal-to-noise ratios (-14, -10, -6, and -2 dB) to evaluate each subject's ability to cope with challenging listening conditions.
We found a significant decline in target speech recognition when the masking speech was linguistically structured and intelligible. Specifically, masking speech with higher linguistic complexity, such as coherent sentences, produced greater interference than masking speech with lower complexity, such as sequences of syllables. The linguistic release from masking, obtained when masker complexity decreased from sentences to syllables, was correlated with, and linearly additive to, the spatial release from masking produced by spatial separation of the masker and target.
These findings illustrate the influence of the linguistic complexity of masking speech on the recognition of target speech, suggesting that higher-level linguistic processing of irrelevant speech is involved in noisy environments. |
---|---|
Author | Chen, Liangjie; Lu, Lingxi; Wu, Danni; Zhang, Xiaoyu |
Author_xml | – sequence: 1 givenname: Lingxi orcidid: 0000-0001-5090-4663 surname: Lu fullname: Lu, Lingxi – sequence: 2 givenname: Danni surname: Wu fullname: Wu, Danni – sequence: 3 givenname: Xiaoyu surname: Zhang fullname: Zhang, Xiaoyu – sequence: 4 givenname: Liangjie surname: Chen fullname: Chen, Liangjie |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/39626028$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
DBID | AAYXX CITATION CGR CUY CVF ECM EIF NPM 7X8 |
DOI | 10.1044/2024_JSLHR-24-00404 |
DatabaseName | CrossRef Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed MEDLINE - Academic |
DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) MEDLINE - Academic |
DatabaseTitleList | MEDLINE - Academic MEDLINE |
Discipline | Medicine Languages & Literatures Social Welfare & Social Work |
EISSN | 1558-9102 |
EndPage | 78 |
ExternalDocumentID | 39626028 10_1044_2024_JSLHR_24_00404 |
Genre | Journal Article |
ISSN | 1092-4388 1558-9102 |
IngestDate | Fri Jul 11 04:26:41 EDT 2025 Thu Apr 03 07:02:36 EDT 2025 Tue Jul 01 01:22:40 EDT 2025 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Language | English |
ORCID | 0000-0001-5090-4663 |
PMID | 39626028 |
PageCount | 10 |
PublicationCentury | 2000 |
PublicationDate | 2025-Jan-02 |
PublicationDateYYYYMMDD | 2025-01-02 |
PublicationDate_xml | – month: 01 year: 2025 text: 2025-Jan-02 day: 02 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | Journal of speech, language, and hearing research |
PublicationTitleAlternate | J Speech Lang Hear Res |
PublicationYear | 2025 |
StartPage | 69 |
SubjectTerms | Adult; Female; Humans; Linguistics; Male; Noise; Perceptual Masking - physiology; Signal-To-Noise Ratio; Speech Intelligibility - physiology; Speech Perception - physiology; Young Adult |
Title | Linguistic Processing of Unattended Speech Under a Cocktail Party Listening Scenario |
URI | https://www.ncbi.nlm.nih.gov/pubmed/39626028 https://www.proquest.com/docview/3140925073 |
Volume | 68 |
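The abstract describes estimating an intelligibility threshold by fitting a psychometric function to recognition accuracy at four signal-to-noise ratios, and reports that linguistic and spatial release from masking add linearly. The sketch below illustrates that arithmetic only. It assumes a two-parameter logistic function fitted by least squares on logit-transformed accuracies, which is a simplification and not necessarily the authors' fitting procedure. The function name `fit_threshold`, the condition labels, the slope value, and all accuracy data are hypothetical and generated for illustration; none of them come from the study.

```python
import math

# The four signal-to-noise ratios (dB) named in the abstract.
SNRS_DB = [-14.0, -10.0, -6.0, -2.0]


def logistic(snr, mu, k):
    """Proportion correct at a given SNR; mu is the 50%-correct threshold."""
    return 1.0 / (1.0 + math.exp(-k * (snr - mu)))


def fit_threshold(snrs, accuracies, eps=1e-3):
    """Return (mu, k) from a least-squares line through logit(accuracy).

    Assumption: logit(p) = k * (snr - mu), so mu = -intercept / slope.
    """
    logits = []
    for p in accuracies:
        p = min(max(p, eps), 1.0 - eps)  # keep the logit finite at 0 or 1
        logits.append(math.log(p / (1.0 - p)))
    n = len(snrs)
    mean_x = sum(snrs) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in snrs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(snrs, logits))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, slope


# Hypothetical "true" thresholds (dB) used only to synthesize example data.
TRUE_THRESHOLDS = {
    ("sentence", "colocated"): -4.0,
    ("sentence", "separated"): -7.0,
    ("syllable", "colocated"): -6.0,
    ("syllable", "separated"): -9.0,
}
HYPOTHETICAL_SLOPE = 0.5  # per dB, made up for the example

thresholds = {}
for condition, mu_true in TRUE_THRESHOLDS.items():
    accuracies = [logistic(s, mu_true, HYPOTHETICAL_SLOPE) for s in SNRS_DB]
    thresholds[condition], _ = fit_threshold(SNRS_DB, accuracies)

# A lower threshold means better recognition, so release = drop in threshold.
baseline = thresholds[("sentence", "colocated")]
linguistic_release = baseline - thresholds[("syllable", "colocated")]
spatial_release = baseline - thresholds[("sentence", "separated")]
combined_release = baseline - thresholds[("syllable", "separated")]

print(f"linguistic release: {linguistic_release:.1f} dB")
print(f"spatial release:    {spatial_release:.1f} dB")
print(f"combined release:   {combined_release:.1f} dB")
```

In this synthetic example the combined release equals the sum of the two separate releases because the made-up thresholds contain no interaction term, which is what linear additivity means here. With real data, a maximum-likelihood fit with lapse and guess rates would be a more typical choice than the logit-linear shortcut used above.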