Linguistic Processing of Unattended Speech Under a Cocktail Party Listening Scenario

Bibliographic Details
Published in	Journal of speech, language, and hearing research Vol. 68; no. 1; pp. 69 - 78
Main Authors	Lu, Lingxi; Wu, Danni; Zhang, Xiaoyu; Chen, Liangjie
Format	Journal Article
Language	English
Published	United States 2025-01-02
Subjects	Adult; Female; Humans; Linguistics; Male; Noise; Perceptual Masking - physiology; Signal-To-Noise Ratio; Speech Intelligibility - physiology; Speech Perception - physiology; Young Adult
Online Access	Get full text
ISSN	1092-4388 1558-9102
DOI	10.1044/2024_JSLHR-24-00404

Abstract	In a cocktail party listening environment, how the different levels of the linguistic hierarchy in unattended speech are processed, and how they influence target speech recognition, remains controversial. This study investigates how different levels of linguistic structure (syllable, word, and sentence) in competing speech influence the recognition of target speech in a speech-on-speech masking situation. Thirty-six participants were instructed to recognize target speech while it was masked by competing speech whose masker type varied across syllables, words, and sentences. The perceived spatial location was altered to examine the interaction between linguistic unmasking effects and spatial unmasking effects. Recognition performance (i.e., intelligibility threshold) was determined by fitting psychometric functions to the recognition accuracies across four signal-to-noise ratios (-14, -10, -6, and -2 dB) to evaluate each subject's ability to cope with challenging listening conditions. Target speech recognition declined significantly when the masking speech was linguistically structured and intelligible. Specifically, masking speech with higher linguistic complexity, such as coherent sentences, caused more interference than masking speech with lower complexity, such as sequences of syllables. The linguistic release from masking, resulting from a decrease in the linguistic complexity of maskers from sentences to syllables, was correlated with, and also linearly additive to, the spatial release from masking due to the spatial separation of the masker and target. These findings illustrate the influence of the linguistic complexity of masking speech on the recognition of target speech, suggesting that higher-level linguistic processing of irrelevant speech is involved in noisy environments.
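Note	The threshold estimation described in the abstract can be illustrated with a minimal sketch. It assumes a two-parameter logistic psychometric function, a least-squares grid search, and a threshold defined as the 50%-correct point; the abstract specifies none of these, so the function names, search ranges, and accuracies below are hypothetical and are not data or methods taken from the study. Only the four signal-to-noise ratios come from the record.

```python
import math

def logistic(snr_db, threshold_db, slope):
    """Predicted proportion correct at a given SNR (dB); equals 0.5 at threshold_db."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - threshold_db)))

def fit_threshold(snrs_db, accuracies):
    """Least-squares grid search over threshold and slope.

    The search ranges and step sizes are illustrative assumptions.
    Returns (threshold_db, slope).
    """
    best = None
    for t_step in range(-200, 1):        # thresholds -20.0 .. 0.0 dB in 0.1 dB steps
        threshold = t_step / 10.0
        for s_step in range(1, 201):     # slopes 0.01 .. 2.00 per dB
            slope = s_step / 100.0
            err = sum((logistic(x, threshold, slope) - y) ** 2
                      for x, y in zip(snrs_db, accuracies))
            if best is None or err < best[0]:
                best = (err, threshold, slope)
    return best[1], best[2]

def release_from_masking(threshold_reference_db, threshold_released_db):
    """Threshold improvement in dB; positive means the released condition is easier."""
    return threshold_reference_db - threshold_released_db

SNRS_DB = [-14, -10, -6, -2]  # the four SNRs named in the abstract

# Made-up accuracies for two hypothetical masker conditions (not data from the study).
sentence_masker = [logistic(x, -6.0, 0.5) for x in SNRS_DB]
syllable_masker = [logistic(x, -9.0, 0.5) for x in SNRS_DB]

thr_sentence, _ = fit_threshold(SNRS_DB, sentence_masker)
thr_syllable, _ = fit_threshold(SNRS_DB, syllable_masker)
linguistic_release_db = release_from_masking(thr_sentence, thr_syllable)
```

Under these assumptions, a release from masking is simply the difference between two fitted thresholds. The same subtraction applied to co-located versus spatially separated conditions would give a spatial release, and "linearly additive" would mean that the combined release is approximately the sum of the two.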
Author	Chen, Liangjie Lu, Lingxi Wu, Danni Zhang, Xiaoyu
Author_xml	– sequence: 1 givenname: Lingxi orcidid: 0000-0001-5090-4663 surname: Lu fullname: Lu, Lingxi – sequence: 2 givenname: Danni surname: Wu fullname: Wu, Danni – sequence: 3 givenname: Xiaoyu surname: Zhang fullname: Zhang, Xiaoyu – sequence: 4 givenname: Liangjie surname: Chen fullname: Chen, Liangjie
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/39626028$$D View this record in MEDLINE/PubMed
ContentType	Journal Article
DBID	AAYXX CITATION CGR CUY CVF ECM EIF NPM 7X8
DOI	10.1044/2024_JSLHR-24-00404
DatabaseName	CrossRef Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed MEDLINE - Academic
DatabaseTitle	CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) MEDLINE - Academic
DatabaseTitleList	MEDLINE - Academic MEDLINE
Discipline	Medicine Languages & Literatures Social Welfare & Social Work
EISSN	1558-9102
EndPage	78
ExternalDocumentID	39626028 10_1044_2024_JSLHR_24_00404
Genre	Journal Article
ISSN	1092-4388 1558-9102
IngestDate	Fri Jul 11 04:26:41 EDT 2025 Thu Apr 03 07:02:36 EDT 2025 Tue Jul 01 01:22:40 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	1
Language	English
LinkModel	OpenURL
ORCID	0000-0001-5090-4663
PMID	39626028
PQID	3140925073
PQPubID	23479
PageCount	10
ParticipantIDs	proquest_miscellaneous_3140925073 pubmed_primary_39626028 crossref_primary_10_1044_2024_JSLHR_24_00404
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	2025-Jan-02
PublicationDateYYYYMMDD	2025-01-02
PublicationDate_xml	– month: 01 year: 2025 text: 2025-Jan-02 day: 02
PublicationDecade	2020
PublicationPlace	United States
PublicationPlace_xml	– name: United States
PublicationTitle	Journal of speech, language, and hearing research
PublicationTitleAlternate	J Speech Lang Hear Res
PublicationYear	2025
SourceID	proquest pubmed crossref
SourceType	Aggregation Database Index Database
StartPage	69
SubjectTerms	Adult; Female; Humans; Linguistics; Male; Noise; Perceptual Masking - physiology; Signal-To-Noise Ratio; Speech Intelligibility - physiology; Speech Perception - physiology; Young Adult
Title	Linguistic Processing of Unattended Speech Under a Cocktail Party Listening Scenario
URI	https://www.ncbi.nlm.nih.gov/pubmed/39626028 https://www.proquest.com/docview/3140925073
Volume	68