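Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of Speech-Impaired People (언어장애인의 스마트스피커 접근성 향상을 위한 개인화된 음성 분류 기법)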

Bibliographic Details
Published in	Smart Media Journal (스마트미디어저널) Vol. 11; no. 11; pp. 17-24
Main Authors	이승권, 전광일, 최우진
Format	Journal Article
Language	Korean
Published	Korean Institute of Smart Media (한국스마트미디어학회), 31.12.2022
Subjects	deep learning; personalized speech classification scheme; smart speaker; speech-impaired people; disabled accessibility
ISSN	2287-1322

Abstract	With the spread of smart speakers built on speech recognition and deep learning, not only non-disabled people but also people with visual or physical disabilities can easily control home appliances such as lights and TVs by voice through linked home network services, which has greatly improved their quality of life. Speech-impaired people, however, cannot use these services, because articulation or speech disorders make their pronunciation inaccurate. This paper proposes a personalized speech classification technique that lets speech-impaired users access a subset of the functions a smart speaker provides. The goal is to raise the recognition rate and accuracy for sentences spoken by speech-impaired people, even with a small amount of data and a short training time, so that the smart speaker's services become usable in practice. A ResNet18 model was fine-tuned with data augmentation and a one-cycle learning rate optimization technique. In an experiment in which each of 30 smart speaker commands was recorded 10 times and the model was trained for under 3 minutes, the speech classification accuracy was about 95.2%.
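The abstract names a one-cycle learning rate technique but reports no hyperparameters. As a rough illustration only, the sketch below computes a cosine-shaped one-cycle schedule in plain Python: the rate warms up to a peak and then anneals to a very small value. The function name `one_cycle_lr`, the peak rate, warm-up fraction, divisors, batch size, and epoch count are all assumptions chosen for the example, not values from the paper; only the 30 commands and 10 recordings per command come from the abstract.

```python
import math


def one_cycle_lr(step, total_steps, max_lr=1e-3, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at a 0-based step under a cosine one-cycle policy.

    All default values are illustrative assumptions, not the paper's settings.
    """
    if total_steps < 3:
        raise ValueError("total_steps must be at least 3")
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")

    initial_lr = max_lr / div_factor            # rate at the first step
    final_lr = initial_lr / final_div_factor    # rate at the last step
    # Step at which the peak is reached, clamped so both phases are non-empty.
    peak_step = min(max(1, int(pct_start * total_steps)), total_steps - 2)

    def cosine(start, end, fraction):
        # Moves smoothly from start (fraction 0) to end (fraction 1).
        return end + (start - end) / 2.0 * (1.0 + math.cos(math.pi * fraction))

    if step <= peak_step:
        return cosine(initial_lr, max_lr, step / peak_step)
    decay_steps = total_steps - 1 - peak_step
    return cosine(max_lr, final_lr, (step - peak_step) / decay_steps)


# Dataset size from the abstract: 30 commands, each recorded 10 times.
commands, recordings_per_command = 30, 10
num_samples = commands * recordings_per_command

# Hypothetical training setup (not stated in the abstract).
batch_size, epochs = 32, 10
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * epochs

schedule = [one_cycle_lr(s, total_steps) for s in range(total_steps)]
```

With so few samples, the whole schedule spans only a small number of optimizer steps, which is in keeping with the short training time the abstract emphasizes. In PyTorch, `torch.optim.lr_scheduler.OneCycleLR` offers a comparable built-in schedule; whether the authors used it is not stated in this record.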
Copyright	COPYRIGHT(C) KYOBO BOOK CENTRE ALL RIGHTS RESERVED
DEWEY	302.23 657.84
DatabaseName	Kyobo Book Centre Scholar (교보문고 스콜라); KoreaScience
Discipline	Anthropology Social Sciences (General) Business
DocumentTitleAlternate	Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of the Speech-Impaired people
ExternalDocumentID	JAKO202211552826597 4010037002610
IsOpenAccess	true
IsPeerReviewed	false
IsScholarly	false
Notes	KISTI1.1003/JNL.JAKO202211552826597
OpenAccessLink	http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202211552826597&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01
PageCount	8
PublicationTitleAlternate	Smart media journal
TableOfContents	Ⅰ. Introduction Ⅱ. Related Work Ⅲ. Training a Personalized Speech Model for Speech-Impaired People Ⅳ. Experiments and Results Ⅴ. Conclusion REFERENCES
URI	https://scholar.kyobobook.co.kr/article/detail/4010037002610 http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202211552826597&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01