Room-localized speech activity detection in multi-microphone smart homes

Bibliographic Details
Published in EURASIP Journal on Audio, Speech, and Music Processing, Vol. 2019, no. 1, pp. 1-23
Main Authors Giannoulis, Panagiotis; Potamianos, Gerasimos; Maragos, Petros
Format Journal Article
Language English
Published Cham Springer International Publishing 27.08.2019
Springer Nature B.V
SpringerOpen
ISSN 1687-4722
1687-4714
DOI 10.1186/s13636-019-0158-8

Abstract Voice-enabled interaction systems in domestic environments have attracted significant interest recently, being the focus of smart home research projects and commercial voice assistant home devices. Within the multi-module pipelines of such systems, speech activity detection (SAD) constitutes a crucial component, providing input to their activation and speech recognition subsystems. In typical multi-room domestic environments, SAD may also convey spatial intelligence to the interaction, in addition to its traditional temporal segmentation output, by assigning speech activity at the room level. Such room-localized SAD can, for example, disambiguate user command referents, allow localized system feedback, and enable parallel voice interaction sessions by multiple subjects in different rooms. In this paper, we investigate a room-localized SAD system for smart homes equipped with multiple microphones distributed in multiple rooms, significantly extending our earlier work. The system employs a two-stage algorithm, incorporating a set of hand-crafted features specially designed to discriminate room-inside vs. room-outside speech at its second stage, refining SAD hypotheses obtained at its first stage by traditional statistical modeling and acoustic front-end processing. Both algorithmic stages exploit multi-microphone information, combining it at the signal, feature, or decision level. The proposed approach is extensively evaluated on both simulated and real data recorded in a multi-room, multi-microphone smart home, significantly outperforming alternative baselines. Further, it remains robust to reduced microphone setups, while also comparing favorably to deep learning-based alternatives.
ArticleNumber 15
Author Maragos, Petros
Potamianos, Gerasimos
Giannoulis, Panagiotis
Author_xml – sequence: 1
  givenname: Panagiotis
  orcidid: 0000-0001-5446-0457
  surname: Giannoulis
  fullname: Giannoulis, Panagiotis
  email: pangian@cs.ntua.gr
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athena Research and Innovation Center
– sequence: 2
  givenname: Gerasimos
  surname: Potamianos
  fullname: Potamianos, Gerasimos
  organization: Department of Electrical and Computer Engineering, University of Thessaly, Athena Research and Innovation Center
– sequence: 3
  givenname: Petros
  surname: Maragos
  fullname: Maragos, Petros
  organization: School of Electrical and Computer Engineering, National Technical University of Athens, Athena Research and Innovation Center
CitedBy_id crossref_primary_10_1186_s13636_024_00386_y
crossref_primary_10_3389_fdata_2024_1419562
ContentType Journal Article
Copyright The Author(s) 2019
EURASIP Journal on Audio, Speech, and Music Processing is a copyright of Springer, (2019). All Rights Reserved. © 2019. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline Engineering
EISSN 1687-4722
EndPage 23
GrantInformation_xml – fundername: Horizon 2020 Framework Programme
  grantid: 687831
  funderid: http://dx.doi.org/10.13039/100010661
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Speech activity detection
Microphone arrays
Active room selection
Multi-channel fusion
Smart homes
Language English
ORCID 0000-0001-5446-0457
OpenAccessLink https://doi.org/10.1186/s13636-019-0158-8
PageCount 23
PublicationCentury 2000
PublicationDate 2019-08-27
PublicationDateYYYYMMDD 2019-08-27
PublicationDate_xml – month: 08
  year: 2019
  text: 2019-08-27
  day: 27
PublicationDecade 2010
PublicationPlace Cham
PublicationPlace_xml – name: Cham
– name: New York
PublicationTitle EURASIP journal on audio, speech, and music processing
PublicationTitleAbbrev J AUDIO SPEECH MUSIC PROC
PublicationYear 2019
Publisher Springer International Publishing
Springer Nature B.V
SpringerOpen
Publisher_xml – name: Springer International Publishing
– name: Springer Nature B.V
– name: SpringerOpen
StartPage 1
SubjectTerms Acoustics
Active room selection
Algorithms
Computer simulation
Engineering
Engineering Acoustics
Machine learning
Mathematics in Music
Microphone arrays
Microphones
Multi-channel fusion
Research projects
Segmentation
Signal, Image and Speech Processing
Smart buildings
Smart homes
Smart houses
Speech activity detection
Speech recognition
Statistical models
Subsystems
Voice activity detectors
Title Room-localized speech activity detection in multi-microphone smart homes
URI https://link.springer.com/article/10.1186/s13636-019-0158-8
https://www.proquest.com/docview/2280628736
https://doaj.org/article/31b6a56405a6435b9dbce57e20fa6381
Volume 2019