Linguistic Processing of Unattended Speech Under a Cocktail Party Listening Scenario

Bibliographic Details
Published in Journal of speech, language, and hearing research Vol. 68; no. 1; pp. 69-78
Main Authors Lu, Lingxi, Wu, Danni, Zhang, Xiaoyu, Chen, Liangjie
Format Journal Article
Language English
Published United States 02.01.2025
ISSN 1092-4388
1558-9102
DOI 10.1044/2024_JSLHR-24-00404

Abstract In the context of a cocktail party listening environment, the processing of different linguistic hierarchy levels in unattended speech and their influence on target speech recognition remain controversial. This study aims to investigate how different levels of linguistic structure (syllable, word, and sentence) in competing speech influence the recognition of target speech in a speech-on-speech masking situation. Thirty-six participants were instructed to recognize target speech masked by competing speech whose masker type varied across syllables, words, and sentences. The perceived spatial location was altered to examine the interaction between linguistic unmasking effects and spatial unmasking effects. Recognition performance (i.e., intelligibility threshold) was determined by fitting psychometric functions to the recognition accuracies across four signal-to-noise ratios (-14, -10, -6, and -2 dB) to evaluate each subject's ability to cope with challenging listening conditions. The results revealed a significant decline in target speech recognition when the masking speech was linguistically structured and intelligible. Specifically, masking speech with higher linguistic complexity, such as coherent sentences, produced greater interference than masking speech with lower complexity, such as sequences of syllables. The linguistic release from masking, resulting from a decrease in the linguistic complexity of maskers from sentences to syllables, was found to be correlated with, and also linearly additive to, the spatial release from masking due to the spatial separation of the masker and target. These findings illustrate the influence of linguistic complexity in masking speech on the recognition of target speech, suggesting the involvement of higher-level linguistic processing of irrelevant speech in noisy environments.
AbstractList PURPOSE: In the context of a cocktail party listening environment, the processing of different linguistic hierarchy levels in unattended speech and their influence on target speech recognition remain controversial. This study aims to investigate how different levels of linguistic structure (syllable, word, and sentence) in competing speech influence the recognition of target speech in a speech-on-speech masking situation.
METHOD: Thirty-six participants were instructed to recognize target speech masked by competing speech whose masker type varied across syllables, words, and sentences. The perceived spatial location was altered to examine the interaction between linguistic unmasking effects and spatial unmasking effects. Recognition performance (i.e., intelligibility threshold) was determined by fitting psychometric functions to the recognition accuracies across four signal-to-noise ratios (-14, -10, -6, and -2 dB) to evaluate each subject's ability to cope with challenging listening conditions.
RESULTS: The results revealed a significant decline in target speech recognition when the masking speech was linguistically structured and intelligible. Specifically, masking speech with higher linguistic complexity, such as coherent sentences, produced greater interference than masking speech with lower complexity, such as sequences of syllables. The linguistic release from masking, resulting from a decrease in the linguistic complexity of maskers from sentences to syllables, was found to be correlated with, and also linearly additive to, the spatial release from masking due to the spatial separation of the masker and target.
CONCLUSIONS: These findings illustrate the influence of linguistic complexity in masking speech on the recognition of target speech, suggesting the involvement of higher-level linguistic processing of irrelevant speech in noisy environments.
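Illustrative sketch (not from the article): the abstract says that intelligibility thresholds were obtained by fitting psychometric functions to recognition accuracies at four signal-to-noise ratios, and that release from masking was compared across conditions. The record does not state the functional form or the fitting procedure, so the Python sketch below assumes a two-parameter logistic function fitted by least squares on logit-transformed accuracy, with the threshold defined as the SNR giving 50% correct. The function names and the example numbers are hypothetical.

```python
import math

# The four signal-to-noise ratios named in the abstract (dB).
SNRS_DB = [-14.0, -10.0, -6.0, -2.0]


def logistic(snr_db, threshold_db, slope):
    """Proportion correct predicted by an assumed two-parameter logistic."""
    return 1.0 / (1.0 + math.exp(-slope * (snr_db - threshold_db)))


def _logit(p, eps):
    # Clip so that accuracies of exactly 0 or 1 still have a finite logit.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def fit_threshold(snrs_db, accuracies, eps=1e-3):
    """Return (threshold_db, slope) from a straight-line fit in logit space.

    Assumption: logit(accuracy) = slope * (snr - threshold), so the
    threshold is the SNR at which predicted accuracy equals 50%.
    """
    n = len(snrs_db)
    ys = [_logit(p, eps) for p in accuracies]
    mean_x = sum(snrs_db) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in snrs_db)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(snrs_db, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, slope


def release_from_masking(harder_threshold_db, easier_threshold_db):
    """Threshold improvement in dB; positive means the easier condition helps."""
    return harder_threshold_db - easier_threshold_db


if __name__ == "__main__":
    # Hypothetical listener whose true threshold is -8 dB (made-up data).
    accuracies = [logistic(s, -8.0, 0.5) for s in SNRS_DB]
    threshold_db, slope = fit_threshold(SNRS_DB, accuracies)
    print(f"threshold = {threshold_db:.2f} dB, slope = {slope:.2f} per dB")
```

Under these assumptions, a linguistic release from masking would be the sentence-masker threshold minus the syllable-masker threshold, and a spatial release would be the co-located threshold minus the spatially separated threshold. Linear additivity would then mean that the release measured when both factors change together is close to the sum of the two.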
Author Chen, Liangjie
Lu, Lingxi
Wu, Danni
Zhang, Xiaoyu
Author_xml – sequence: 1
  givenname: Lingxi
  orcidid: 0000-0001-5090-4663
  surname: Lu
  fullname: Lu, Lingxi
– sequence: 2
  givenname: Danni
  surname: Wu
  fullname: Wu, Danni
– sequence: 3
  givenname: Xiaoyu
  surname: Zhang
  fullname: Zhang, Xiaoyu
– sequence: 4
  givenname: Liangjie
  surname: Chen
  fullname: Chen, Liangjie
BackLink https://www.ncbi.nlm.nih.gov/pubmed/39626028$$D View this record in MEDLINE/PubMed
ContentType Journal Article
DBID AAYXX
CITATION
CGR
CUY
CVF
ECM
EIF
NPM
7X8
DOI 10.1044/2024_JSLHR-24-00404
DatabaseName CrossRef
MEDLINE
MEDLINE (Ovid)
PubMed
MEDLINE - Academic
DatabaseTitle CrossRef
MEDLINE
Medline Complete
MEDLINE with Full Text
PubMed
MEDLINE (Ovid)
MEDLINE - Academic
DatabaseTitleList MEDLINE - Academic
MEDLINE
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: EIF
  name: MEDLINE
  url: https://proxy.k.utb.cz/login?url=https://www.webofscience.com/wos/medline/basic-search
  sourceTypes: Index Database
DeliveryMethod fulltext_linktorsrc
Discipline Medicine
Languages & Literatures
Social Welfare & Social Work
EISSN 1558-9102
EndPage 78
ExternalDocumentID 39626028
10_1044_2024_JSLHR_24_00404
Genre Journal Article
ID FETCH-LOGICAL-c170t-d57ea67407e0eb414a661912621def7cfa06b52a3ce0e171c1ac36c7e374110b3
ISSN 1092-4388
1558-9102
IngestDate Fri Jul 11 04:26:41 EDT 2025
Thu Apr 03 07:02:36 EDT 2025
Tue Jul 01 01:22:40 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c170t-d57ea67407e0eb414a661912621def7cfa06b52a3ce0e171c1ac36c7e374110b3
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ORCID 0000-0001-5090-4663
PMID 39626028
PQID 3140925073
PQPubID 23479
PageCount 10
ParticipantIDs proquest_miscellaneous_3140925073
pubmed_primary_39626028
crossref_primary_10_1044_2024_JSLHR_24_00404
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2025-Jan-02
PublicationDateYYYYMMDD 2025-01-02
PublicationDate_xml – month: 01
  year: 2025
  text: 2025-Jan-02
  day: 02
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Journal of speech, language, and hearing research
PublicationTitleAlternate J Speech Lang Hear Res
PublicationYear 2025
SSID ssj0000146
Score 2.4419496
Snippet In the context of a cocktail party listening environment, the processing of different linguistic hierarchy levels in unattended speech and their influence on...
SourceID proquest
pubmed
crossref
SourceType Aggregation Database
Index Database
StartPage 69
SubjectTerms Adult
Female
Humans
Linguistics
Male
Noise
Perceptual Masking - physiology
Signal-To-Noise Ratio
Speech Intelligibility - physiology
Speech Perception - physiology
Young Adult
Title Linguistic Processing of Unattended Speech Under a Cocktail Party Listening Scenario
URI https://www.ncbi.nlm.nih.gov/pubmed/39626028
https://www.proquest.com/docview/3140925073
Volume 68
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider EBSCOhost
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Linguistic+Processing+of+Unattended+Speech+Under+a+Cocktail+Party+Listening+Scenario&rft.jtitle=Journal+of+speech%2C+language%2C+and+hearing+research&rft.au=Lu%2C+Lingxi&rft.au=Wu%2C+Danni&rft.au=Zhang%2C+Xiaoyu&rft.au=Chen%2C+Liangjie&rft.date=2025-01-02&rft.issn=1092-4388&rft.eissn=1558-9102&rft.volume=68&rft.issue=1&rft.spage=69&rft.epage=78&rft_id=info:doi/10.1044%2F2024_JSLHR-24-00404&rft.externalDBID=n%2Fa&rft.externalDocID=10_1044_2024_JSLHR_24_00404