Robust Speech Recognition Using Generative Adversarial Networks
Sriram, Anuroop, Jun, Heewoo, Gaur, Yashesh, Satheesh, Sanjeev
Published in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01.04.2018)
Conference Proceeding
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR
Inaguma, Hirofumi, Gaur, Yashesh, Lu, Liang, Li, Jinyu, Gong, Yifan
Published in ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (01.05.2020)
Conference Proceeding
Exploring neural transducers for end-to-end speech recognition
Battenberg, Eric, Chen, Jitong, Child, Rewon, Coates, Adam, Gaur, Yashesh, Li, Yi, Liu, Hairong, Satheesh, Sanjeev, Sriram, Anuroop, Zhu, Zhenyao
Published in 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01.12.2017)
Conference Proceeding
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition
Meng, Zhong, Li, Jinyu, Gaur, Yashesh, Gong, Yifan
Published in 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01.12.2019)
Conference Proceeding
Continuous Streaming Multi-Talker ASR with Dual-Path Transducers
Raj, Desh, Lu, Liang, Chen, Zhuo, Gaur, Yashesh, Li, Jinyu
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23.05.2022)
Conference Proceeding
Internal Language Model Training for Domain-Adaptive End-To-End Speech Recognition
Meng, Zhong, Kanda, Naoyuki, Gaur, Yashesh, Parthasarathy, Sarangarajan, Sun, Eric, Lu, Liang, Chen, Xie, Li, Jinyu, Gong, Yifan
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06.06.2021)
Conference Proceeding
CTCBERT: Advancing Hidden-Unit BERT with CTC Objectives
Fan, Ruchao, Wang, Yiming, Gaur, Yashesh, Li, Jinyu
Published in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (04.06.2023)
Conference Proceeding
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR
Kanda, Naoyuki, Xiao, Xiong, Gaur, Yashesh, Wang, Xiaofei, Meng, Zhong, Chen, Zhuo, Yoshioka, Takuya
Published in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (23.05.2022)
Conference Proceeding
Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation
Papi, Sara, Wang, Peidong, Chen, Junkun, Xue, Jian, Kanda, Naoyuki, Li, Jinyu, Gaur, Yashesh
Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (14.04.2024)
Conference Proceeding
VioLA: Conditional Language Models for Speech Recognition, Synthesis, and Translation
Wang, Tianrui, Zhou, Long, Zhang, Ziqiang, Wu, Yu, Liu, Shujie, Gaur, Yashesh, Chen, Zhuo, Li, Jinyu, Wei, Furu
Published in IEEE/ACM transactions on audio, speech, and language processing (2024)
Journal Article
Hypothesis Stitcher for End-to-End Speaker-Attributed ASR on Long-Form Multi-Talker Recordings
Chang, Xuankai, Kanda, Naoyuki, Gaur, Yashesh, Wang, Xiaofei, Meng, Zhong, Yoshioka, Takuya
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06.06.2021)
Conference Proceeding
Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR
Kanda, Naoyuki, Meng, Zhong, Lu, Liang, Gaur, Yashesh, Wang, Xiaofei, Chen, Zhuo, Yoshioka, Takuya
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06.06.2021)
Conference Proceeding
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition
Meng, Zhong, Parthasarathy, Sarangarajan, Sun, Eric, Gaur, Yashesh, Kanda, Naoyuki, Lu, Liang, Chen, Xie, Zhao, Rui, Li, Jinyu, Gong, Yifan
Published in 2021 IEEE Spoken Language Technology Workshop (SLT) (19.01.2021)
Conference Proceeding
Ensemble Combination between Different Time Segmentations
Wong, Jeremy H. M., Dimitriadis, Dimitrios, Kumatani, Kenichi, Gaur, Yashesh, Polovets, George, Parthasarathy, Partha, Sun, Eric, Li, Jinyu, Gong, Yifan
Published in ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (06.06.2021)
Conference Proceeding
Speech ReaLLM -- Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time
Seide, Frank, Doulaty, Morrie, Shi, Yangyang, Gaur, Yashesh, Jia, Junteng, Wu, Chunyang
Year of Publication 13.06.2024
Journal Article
Character-Aware Attention-Based End-to-End Speech Recognition
Meng, Zhong, Gaur, Yashesh, Li, Jinyu, Gong, Yifan
Published in 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (01.12.2019)
Conference Proceeding
Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings
Kanda, Naoyuki, Chang, Xuankai, Gaur, Yashesh, Wang, Xiaofei, Meng, Zhong, Chen, Zhuo, Yoshioka, Takuya
Published in 2021 IEEE Spoken Language Technology Workshop (SLT) (19.01.2021)
Conference Proceeding