VOICE PROCESSING DEVICE, METHOD, AND PROGRAM

Bibliographic Details
Main Authors MATSUMURA NARIMUNE, HOSOBUCHI TAKASHI, NUNOBIKI AYAFUMI
Format Patent
Language English, Japanese
Published 22.03.2019
Summary: PROBLEM: To enable filler information to be output to a user before output of the response voice to the user's uttered voice starts.
SOLUTION: Under the control of a voice data acquisition unit 111 and an uttered voice data extraction unit 112, uttered voice data for the user's uttered voice is acquired. Under the control of a response preparation time prediction unit 113, three times are predicted from the user utterance time (derived from the uttered voice data) and from information about response contents data for past uttered voice: a first time required to recognize the uttered voice, a second time required to generate the response contents data, and a third time required to synthesize the response voice. From the predicted first, second, and third times, the delay time from the end of the user's uttered voice until output of the response voice starts is predicted. Under the control of a filler information output unit 114, filler voice data chosen according to the predicted delay time is output to a speaker 15 within that delay time.
SELECTED DRAWING: Figure 2
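The prediction and filler selection described in the summary can be illustrated with a minimal Python sketch. The summary does not state how the three times are combined or how a filler is chosen, so everything here is an assumption made for illustration: the delay is taken as the plain sum of the three predicted times, the linear coefficients are made up, and the names `Filler`, `predict_delay_s`, and `select_filler` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Filler:
    text: str
    duration_s: float  # playback length of the filler voice data, in seconds


def predict_delay_s(utterance_s: float, past_response_chars: Sequence[int]) -> float:
    """Predict the delay from end of utterance to start of the response voice.

    Assumption: each of the three times follows a simple linear model and the
    delay is their sum. The coefficients are illustrative, not from the patent.
    """
    if past_response_chars:
        avg_chars = sum(past_response_chars) / len(past_response_chars)
    else:
        avg_chars = 0.0
    recognition_s = 0.2 * utterance_s        # first time: speech recognition
    generation_s = 0.3 + 0.004 * avg_chars   # second time: response generation
    synthesis_s = 0.01 * avg_chars           # third time: speech synthesis
    return recognition_s + generation_s + synthesis_s


def select_filler(delay_s: float, fillers: Sequence[Filler]) -> Optional[Filler]:
    """Pick the longest filler that still finishes within the predicted delay.

    Assumption: "in accordance with the predicted delay time" is read here as
    "longest filler that fits"; returns None when no filler fits.
    """
    fitting = [f for f in fillers if f.duration_s <= delay_s]
    return max(fitting, key=lambda f: f.duration_s) if fitting else None


FILLERS = [
    Filler("Hmm", 0.4),
    Filler("Let me see", 1.0),
    Filler("Let me check that for you", 2.0),
]

if __name__ == "__main__":
    delay = predict_delay_s(utterance_s=2.0, past_response_chars=[40, 60])
    print(delay, select_filler(delay, FILLERS))
```

In this sketch a longer user utterance or longer past responses raise the predicted delay, which in turn allows a longer filler to be played before the response voice starts.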
Bibliography: Application Number: JP20170172162