Keyword identification framework for speech communication on construction sites

Bibliographic Details
Published in Modular and Offsite Construction (MOC) Summit Proceedings, pp. 106-113
Main Authors Mansoor, Asif; Liu, Shuai; Ali, Ghulam Muhammad; Bouferguene, Ahmed; Al-Hussein, Mohamed; Hassan, Imran
Format Journal Article
Language English
Published 14.09.2022
Abstract Worksite communication is key to teamwork and worker performance on construction sites, and most of it is spoken. However, construction sites are typically noisy because of tasks such as drilling and the operation of heavy equipment. Workers also come from a range of ethnic and linguistic backgrounds and speak with different accents. These factors can make it difficult for the listener to understand the speaker, leading to miscommunication and errors in decision making on the construction site. Recent technological advances can be leveraged to mitigate this problem. In this paper, a keyword identification framework is developed for speech communication on the construction site. For this framework, 12 hours of raw audio containing 18 crane signalman speech commands (referred to as “keywords”) are collected; the signalman uses these keywords to guide the crane operator during crane operations. Two-second audio clips (the approximate duration of each keyword) are extracted from the raw audio dataset, and construction site noise is added. Mel-frequency cepstral coefficients (MFCCs) are then extracted from the waveform audio dataset and used to train a one-dimensional convolutional neural network. The trained model achieves a training accuracy of 97.3%, a validation accuracy of 96.1%, and a testing accuracy of 93.8%. Deployed for real-time identification of keywords in speech, the model achieves an accuracy of 95.3%. The authors conclude that the framework is suitable for real-time identification of specific keywords in speech on noisy construction sites.
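The data-preparation step described in the abstract (cutting the raw audio into 2-second clips and adding construction site noise) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state a sampling rate or how the noise level was set, so the 16 kHz rate, the mixing at a target signal-to-noise ratio, and the names `split_into_clips`, `power`, and `add_noise` are all assumptions made for illustration. A synthetic tone and Gaussian noise stand in for recorded speech and site noise.

```python
import math
import random

SAMPLE_RATE = 16_000  # assumed; the abstract does not state a sampling rate
CLIP_SECONDS = 2      # clip length given in the abstract


def split_into_clips(samples, sample_rate=SAMPLE_RATE, seconds=CLIP_SECONDS):
    """Cut a waveform into consecutive fixed-length clips, dropping any remainder."""
    clip_len = sample_rate * seconds
    n_clips = len(samples) // clip_len
    return [samples[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


def power(samples):
    """Mean squared amplitude of a waveform."""
    return sum(s * s for s in samples) / len(samples)


def add_noise(clip, noise, snr_db):
    """Mix noise into a clip, scaling the noise to the requested SNR in dB.

    Mixing at a fixed SNR is an assumption; the abstract only says noise is added.
    """
    if len(noise) < len(clip):
        raise ValueError("noise must be at least as long as the clip")
    noise = noise[:len(clip)]
    noise_power = power(noise)
    if noise_power == 0:
        return list(clip)
    # SNR_dB = 10 * log10(P_signal / P_noise), so the target noise power is
    # P_signal / 10^(SNR_dB / 10).
    target_noise_power = power(clip) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(clip, noise)]


rng = random.Random(0)
# Five seconds of a 440 Hz tone stands in for recorded speech.
speech = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
          for t in range(5 * SAMPLE_RATE)]
# Two seconds of Gaussian noise stands in for recorded site noise.
site_noise = [rng.gauss(0.0, 1.0) for _ in range(CLIP_SECONDS * SAMPLE_RATE)]

clips = split_into_clips(speech)
noisy = add_noise(clips[0], site_noise, snr_db=10)
```

In a pipeline like the one the abstract outlines, each noisy clip would then be converted to mel-frequency cepstral coefficients and passed to the classifier; those steps are left out here because the abstract gives no parameters for them.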
DOI 10.29173/mocs271
EISSN 2562-5438
ISSN 2562-5438
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License http://creativecommons.org/licenses/by-nc-nd/4.0
OpenAccessLink https://journalofindustrializedconstruction.com/index.php/mocs/article/download/271/235