Room-localized speech activity detection in multi-microphone smart homes
Voice-enabled interaction systems in domestic environments have attracted significant interest recently, being the focus of smart home research projects and commercial voice assistant home devices. Within the multi-module pipelines of such systems, speech activity detection (SAD) constitutes a cruci...
Published in | EURASIP journal on audio, speech, and music processing Vol. 2019; no. 1; pp. 1 - 23 |
---|---|
Main Authors | Giannoulis, Panagiotis; Potamianos, Gerasimos; Maragos, Petros |
Format | Journal Article |
Language | English |
Published | Cham, Springer International Publishing, 27.08.2019; Springer Nature B.V; SpringerOpen |
Subjects | |
ISSN | 1687-4722, 1687-4714 |
DOI | 10.1186/s13636-019-0158-8 |
Abstract | Voice-enabled interaction systems in domestic environments have attracted significant interest recently, being the focus of smart home research projects and commercial voice assistant home devices. Within the multi-module pipelines of such systems, speech activity detection (SAD) constitutes a crucial component, providing input to their activation and speech recognition subsystems. In typical multi-room domestic environments, SAD may also convey spatial intelligence to the interaction, in addition to its traditional temporal segmentation output, by assigning speech activity at the room level. Such room-localized SAD can, for example, disambiguate user command referents, allow localized system feedback, and enable parallel voice interaction sessions by multiple subjects in different rooms. In this paper, we investigate a room-localized SAD system for smart homes equipped with multiple microphones distributed in multiple rooms, significantly extending our earlier work. The system employs a two-stage algorithm, incorporating a set of hand-crafted features specially designed to discriminate room-inside vs. room-outside speech at its second stage, refining SAD hypotheses obtained at its first stage by traditional statistical modeling and acoustic front-end processing. Both algorithmic stages exploit multi-microphone information, combining it at the signal, feature, or decision level. The proposed approach is extensively evaluated on both simulated and real data recorded in a multi-room, multi-microphone smart home, significantly outperforming alternative baselines. Further, it remains robust to reduced microphone setups, while also comparing favorably to deep learning-based alternatives. |
---|---|
ArticleNumber | 15 |
Author | Maragos, Petros; Potamianos, Gerasimos; Giannoulis, Panagiotis |
Author_xml | 1. Giannoulis, Panagiotis (ORCID: 0000-0001-5446-0457; email: pangian@cs.ntua.gr), School of Electrical and Computer Engineering, National Technical University of Athens, Athena Research and Innovation Center; 2. Potamianos, Gerasimos, Department of Electrical and Computer Engineering, University of Thessaly, Athena Research and Innovation Center; 3. Maragos, Petros, School of Electrical and Computer Engineering, National Technical University of Athens, Athena Research and Innovation Center |
CitedBy_id | crossref_primary_10_1186_s13636_024_00386_y crossref_primary_10_3389_fdata_2024_1419562 |
ContentType | Journal Article |
Copyright | The Author(s) 2019 EURASIP Journal on Audio, Speech, and Music Processing is a copyright of Springer, (2019). All Rights Reserved. © 2019. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
Discipline | Engineering |
EISSN | 1687-4722 |
EndPage | 23 |
GrantInformation_xml | Horizon 2020 Framework Programme, grant ID 687831 (funder ID: http://dx.doi.org/10.13039/100010661) |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Keywords | Speech activity detection; Microphone arrays; Active room selection; Multi-channel fusion; Smart homes |
ORCID | 0000-0001-5446-0457 |
OpenAccessLink | https://doi.org/10.1186/s13636-019-0158-8 |
PageCount | 23 |
PublicationCentury | 2000 |
PublicationDate | 2019-08-27 |
PublicationDecade | 2010 |
PublicationPlace | Cham |
PublicationPlace_xml | – name: Cham – name: New York |
PublicationTitle | EURASIP journal on audio, speech, and music processing |
PublicationTitleAbbrev | J AUDIO SPEECH MUSIC PROC |
PublicationYear | 2019 |
Publisher | Springer International Publishing; Springer Nature B.V; SpringerOpen |
StartPage | 1 |
SubjectTerms | Acoustics; Active room selection; Algorithms; Computer simulation; Engineering; Engineering Acoustics; Machine learning; Mathematics in Music; Microphone arrays; Microphones; Multi-channel fusion; Research projects; Segmentation; Signal, Image and Speech Processing; Smart buildings; Smart homes; Smart houses; Speech activity detection; Speech recognition; Statistical models; Subsystems; Voice activity detectors |
Title | Room-localized speech activity detection in multi-microphone smart homes |
URI | https://link.springer.com/article/10.1186/s13636-019-0158-8 https://www.proquest.com/docview/2280628736 https://doaj.org/article/31b6a56405a6435b9dbce57e20fa6381 |
Volume | 2019 |
linkProvider | Springer Nature |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Room-localized+speech+activity+detection+in+multi-microphone+smart+homes&rft.jtitle=EURASIP+journal+on+audio%2C+speech%2C+and+music+processing&rft.au=Giannoulis%2C+Panagiotis&rft.au=Potamianos%2C+Gerasimos&rft.au=Maragos%2C+Petros&rft.date=2019-08-27&rft.pub=Springer+International+Publishing&rft.eissn=1687-4722&rft.volume=2019&rft.issue=1&rft_id=info:doi/10.1186%2Fs13636-019-0158-8&rft.externalDocID=10_1186_s13636_019_0158_8 |