Keyword identification framework for speech communication on construction sites
Published in | Modular and Offsite Construction (MOC) Summit Proceedings pp. 106 - 113 |
---|---|
Main Authors | Mansoor, Asif; Liu, Shuai; Ali, Ghulam Muhammad; Bouferguene, Ahmed; Al-Hussein, Mohamed; Hassan, Imran |
Format | Journal Article |
Language | English |
Published | 14.09.2022 |
Abstract | Worksite communication is key to boosting teamwork and improving worker performance on construction sites, and most of it is spoken. However, construction sites are typically noisy because of tasks such as drilling and the operation of heavy equipment. Workers also come from a range of ethnic and linguistic backgrounds and speak with different accents. Together, noise and accent variation can make it difficult for the listener to understand the speaker, leading to miscommunication and errors in decision making. Recent technological advances can be leveraged to mitigate this problem. In this paper, a keyword identification framework is developed for speech communication on the construction site. For this framework, 12 hours of raw audio data containing 18 crane signalman speech commands (referred to as “keywords”) are collected; the signalman uses these keywords to guide the crane operator during crane operations. Two-second audio clips (the approximate duration of each keyword) are extracted from the raw audio, and construction site noise is added. Mel-frequency cepstral coefficients (MFCCs) are then extracted from the waveform audio and used to train a one-dimensional convolutional neural network. After training, the model achieves a training accuracy of 97.3%, a validation accuracy of 96.1%, and a testing accuracy of 93.8%. When deployed for real-time keyword identification in speech, the model achieves an accuracy of 95.3%. These findings suggest that the developed framework is suitable for real-time identification of specific keywords in speech on noisy construction sites. |
---|---|
Author | Hassan, Imran; Ali, Ghulam Muhammad; Bouferguene, Ahmed; Al-Hussein, Mohamed; Mansoor, Asif; Liu, Shuai |
Author_xml | – sequence: 1 givenname: Asif surname: Mansoor fullname: Mansoor, Asif – sequence: 2 givenname: Shuai surname: Liu fullname: Liu, Shuai – sequence: 3 givenname: Ghulam Muhammad surname: Ali fullname: Ali, Ghulam Muhammad – sequence: 4 givenname: Ahmed surname: Bouferguene fullname: Bouferguene, Ahmed – sequence: 5 givenname: Mohamed surname: Al-Hussein fullname: Al-Hussein, Mohamed – sequence: 6 givenname: Imran surname: Hassan fullname: Hassan, Imran |
ContentType | Journal Article |
DBID | AAYXX CITATION |
DOI | 10.29173/mocs271 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | CrossRef |
DeliveryMethod | fulltext_linktorsrc |
EISSN | 2562-5438 |
EndPage | 113 |
ExternalDocumentID | 10_29173_mocs271 |
GroupedDBID | AAYXX ALMA_UNASSIGNED_HOLDINGS CITATION M48 |
ID | FETCH-LOGICAL-c671-650bd8db975252a309d05a93bd2ffc56c070b4d46f0a8a83fc17092ee0f29ea93 |
ISSN | 2562-5438 |
IngestDate | Tue Jul 01 01:29:13 EDT 2025 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | http://creativecommons.org/licenses/by-nc-nd/4.0 |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-c671-650bd8db975252a309d05a93bd2ffc56c070b4d46f0a8a83fc17092ee0f29ea93 |
OpenAccessLink | https://journalofindustrializedconstruction.com/index.php/mocs/article/download/271/235 |
PageCount | 8 |
ParticipantIDs | crossref_primary_10_29173_mocs271 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2022-09-14 |
PublicationDateYYYYMMDD | 2022-09-14 |
PublicationDate_xml | – month: 09 year: 2022 text: 2022-09-14 day: 14 |
PublicationDecade | 2020 |
PublicationTitle | Modular and Offsite Construction (MOC) Summit Proceedings |
PublicationYear | 2022 |
SSID | ssib043428629 ssib039117972 |
Score | 2.1951883 |
SourceID | crossref |
SourceType | Index Database |
StartPage | 106 |
Title | Keyword identification framework for speech communication on construction sites |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | Canadian Research Knowledge Network |
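
The abstract describes cutting the raw recordings into 2-second clips and adding construction site noise before feature extraction. The record does not say how the noise was mixed, so the following is only a minimal sketch of one common approach: scaling the noise to a chosen signal-to-noise ratio (SNR). The sample rate, the SNR parameter, and the function names `split_into_clips`, `mean_power`, and `add_noise` are assumptions made for illustration and are not taken from the paper.

```python
import math

SAMPLE_RATE = 16_000  # assumed; the abstract does not state a sample rate
CLIP_SECONDS = 2      # from the abstract: each keyword lasts roughly 2 s


def split_into_clips(samples, sample_rate=SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Cut a waveform into consecutive fixed-length clips, dropping any short tail."""
    clip_len = sample_rate * seconds
    return [samples[i:i + clip_len]
            for i in range(0, len(samples) - clip_len + 1, clip_len)]


def mean_power(samples):
    """Average squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_noise(clip, noise, snr_db):
    """Mix noise into a clip, scaling the noise to the requested SNR in dB.

    Hypothetical mixing scheme: the paper's actual procedure is not given.
    """
    if len(noise) < len(clip):
        raise ValueError("noise must be at least as long as the clip")
    noise = noise[:len(clip)]
    noise_power = mean_power(noise)
    if noise_power == 0:
        return list(clip)  # silent noise leaves the clip unchanged
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for the noise power.
    target_noise_power = mean_power(clip) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(clip, noise)]
```

In a pipeline like the one the abstract outlines, each noisy clip would then be passed to an MFCC extractor and the resulting coefficient sequences used to train the one-dimensional convolutional network. A lower `snr_db` simulates a louder site.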