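Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of the Speech-Impaired people (언어장애인의 스마트스피커 접근성 향상을 위한 개인화된 음성 분류 기법)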

Bibliographic Details
Published in 스마트미디어저널 (Smart Media Journal) Vol. 11; no. 11; pp. 17-24
Main Authors 이승권, 전광일, 최우진
Format Journal Article
Language Korean
Published 한국스마트미디어학회 31.12.2022
Korean Institute of Smart Media
ISSN 2287-1322


Abstract With the spread of smart speakers built on speech recognition and deep learning, not only non-disabled people but also blind or physically disabled people can easily control home appliances such as lights and TVs by voice through linked home network services, which has greatly improved their quality of life. Speech-impaired people, however, cannot use these services because articulation or speech disorders make their pronunciation inaccurate. This paper proposes a personalized speech classification technique that lets speech-impaired users access a subset of the functions a smart speaker provides. The goal is to raise the recognition rate and accuracy for sentences spoken by speech-impaired people, even with a small amount of data and a short training time, so that smart speaker services become usable in practice. The method fine-tunes a ResNet18 model and adds data augmentation and one-cycle learning rate optimization. In an experiment in which each of 30 smart speaker commands was recorded 10 times and training took less than 3 minutes, the speech classification accuracy was about 95.2%.
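The one-cycle learning rate policy named in the abstract can be illustrated with a small dependency-free sketch. This is not the authors' implementation: the function name `one_cycle_lr`, the cosine annealing shape, and the values for `pct_start`, `div_factor`, `final_div_factor`, batch size, epoch count, and peak learning rate are all assumptions chosen for illustration. Only the 30 commands and 10 recordings per command come from the abstract.

```python
import math

# From the abstract: 30 commands, each recorded 10 times.
NUM_COMMANDS = 30
RECORDINGS_PER_COMMAND = 10
num_samples = NUM_COMMANDS * RECORDINGS_PER_COMMAND  # 300 recordings before augmentation

# Hypothetical training setup (not stated in the abstract).
BATCH_SIZE = 30
EPOCHS = 10
steps_per_epoch = math.ceil(num_samples / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def _cos_anneal(start, end, pct):
    """Cosine interpolation from `start` (pct=0) to `end` (pct=1)."""
    return end + (start - end) / 2.0 * (1.0 + math.cos(math.pi * pct))


def one_cycle_lr(step, total_steps, max_lr,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """Learning rate at 0-based `step` under an assumed one-cycle policy.

    The rate rises from max_lr / div_factor to max_lr during the first
    pct_start fraction of training, then decays to a much smaller final value.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    initial_lr = max_lr / div_factor
    final_lr = initial_lr / final_div_factor
    warmup_steps = max(1, int(pct_start * total_steps))
    if step < warmup_steps:
        return _cos_anneal(initial_lr, max_lr, step / warmup_steps)
    decay_steps = max(1, total_steps - warmup_steps - 1)
    pct = min(1.0, (step - warmup_steps) / decay_steps)
    return _cos_anneal(max_lr, final_lr, pct)


# Hypothetical peak learning rate of 1e-3.
schedule = [one_cycle_lr(s, total_steps, 1e-3) for s in range(total_steps)]
```

With these assumed settings the schedule covers 100 optimizer steps, peaks at step 30, and ends far below its starting value. The short warm-up followed by a long decay is the usual argument for using this kind of policy when the training budget is only a few minutes.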
Author 이승권
전광일
최우진
ContentType Journal Article
Copyright COPYRIGHT(C) KYOBO BOOK CENTRE ALL RIGHTS RESERVED
DBID P5Y
SSSTE
JDI
DEWEY 302.23
657.84
DatabaseName 교보문고 스콜라
Scholar(스콜라)
KoreaScience
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Anthropology
Social Sciences (General)
Business
DocumentTitleAlternate Personalized Speech Classification Scheme for the Smart Speaker Accessibility Improvement of the Speech-Impaired people
EndPage 24
ExternalDocumentID JAKO202211552826597
4010037002610
GroupedDBID P5Y
SSSTE
.UV
JDI
ID FETCH-LOGICAL-k600-56163dac189171025ee729fe78af904efb4f64771ba148f03fb7d6bd88d60b573
ISSN 2287-1322
IngestDate Fri Dec 22 12:03:24 EST 2023
Thu Apr 10 10:28:03 EDT 2025
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Issue 11
Keywords deep learning
personalized speech classification scheme
smart speaker
speech-impaired people
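스마트스피커 (smart speaker)
딥러닝 (deep learning)
장애인접근성 (accessibility for people with disabilities)
개인화된 음성분류기법 (personalized speech classification scheme)
언어장애인 (speech-impaired people)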
disabled accessibility
Language Korean
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-k600-56163dac189171025ee729fe78af904efb4f64771ba148f03fb7d6bd88d60b573
Notes KISTI1.1003/JNL.JAKO202211552826597
OpenAccessLink http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202211552826597&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01
PageCount 8
ParticipantIDs kisti_ndsl_JAKO202211552826597
kyobo_bookcenter_4010037002610
PublicationCentury 2000
PublicationDate 2022-12-31
PublicationDateYYYYMMDD 2022-12-31
PublicationDate_xml – month: 12
  year: 2022
  text: 2022-12-31
  day: 31
PublicationDecade 2020
PublicationTitle 스마트미디어저널
PublicationTitleAlternate Smart media journal
PublicationYear 2022
Publisher 한국스마트미디어학회
Korean Institute of Smart Media
SSID ssib012146176
ssib036278714
ssib022315842
ssib051117086
ssib044760798
Score 1.8169761
SourceID kisti
kyobo
SourceType Open Access Repository
Publisher
StartPage 17
TableOfContents Ⅰ. Introduction Ⅱ. Related Work Ⅲ. Training a Personalized Speech Model for the Speech-Impaired Ⅳ. Experiments and Results Ⅴ. Conclusion REFERENCES
Title 언어장애인의 스마트스피커 접근성 향상을 위한 개인화된 음성 분류 기법
URI https://scholar.kyobobook.co.kr/article/detail/4010037002610
http://click.ndsl.kr/servlet/LinkingDetailView?cn=JAKO202211552826597&dbt=JAKO&org_code=O481&site_code=SS1481&service_code=01
Volume 11
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider ISSN International Centre