I-Vector Based Clustering Training Data in Speech Recognition Q Huo, ZJ Yan, Y Zhang, J Xu US Patent App. 13/640,804, 2015 | 245 | 2015 |
Deep-FSMN for large vocabulary continuous speech recognition S Zhang, M Lei, Z Yan, L Dai 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 128 | 2018 |
Qwen-Audio: Advancing universal audio understanding via unified large-scale audio-language models Y Chu, J Xu, X Zhou, Q Yang, S Zhang, Z Yan, C Zhou, J Zhou arXiv preprint arXiv:2311.07919, 2023 | 100 | 2023 |
M2MeT: The ICASSP 2022 multi-channel multi-party meeting transcription challenge F Yu, S Zhang, Y Fu, L Xie, S Zheng, Z Du, W Huang, P Guo, Z Yan, B Ma, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 83 | 2022 |
Paraformer: Fast and accurate parallel transformer for non-autoregressive end-to-end speech recognition Z Gao, S Zhang, I McLoughlin, Z Yan arXiv preprint arXiv:2206.08317, 2022 | 65 | 2022 |
A unified trajectory tiling approach to high quality speech rendering Y Qian, FK Soong, ZJ Yan IEEE Transactions on Audio, Speech, and Language Processing 21 (2), 280-290, 2012 | 65 | 2012 |
A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR. ZJ Yan, Q Huo, J Xu Interspeech, 104-108, 2013 | 64 | 2013 |
Improving latency-controlled BLSTM acoustic models for online speech recognition S Xue, Z Yan 2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017 | 63 | 2017 |
A context-sensitive-chunk BPTT approach to training deep LSTM/BLSTM recurrent neural networks for offline handwriting recognition K Chen, ZJ Yan, Q Huo 2015 13th International Conference on Document Analysis and Recognition …, 2015 | 43 | 2015 |
Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition. S Zhang, M Lei, Z Yan Interspeech, 2180-2184, 2019 | 38 | 2019 |
ProsoSpeech: Enhancing prosody with quantized vector pre-training in text-to-speech Y Ren, M Lei, Z Huang, S Zhang, Q Chen, Z Yan, Z Zhao ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 37 | 2022 |
Method and apparatus for initiating an operation using voice data M Xu, Z Yan, J Gao, M Chu US Patent App. 15/292,632, 2017 | 37 | 2017 |
Rich-context unit selection (RUS) approach to high quality TTS ZJ Yan, Y Qian, FK Soong 2010 IEEE International Conference on Acoustics, Speech and Signal …, 2010 | 37 | 2010 |
Rich context modeling for high quality HMM-based TTS ZJ Yan, Y Qian, FK Soong Tenth Annual Conference of the International Speech Communication Association, 2009 | 36 | 2009 |
Trajectory Tiling Approach for Text-to-Speech Y Qian, ZJ Yan, YJ Wu, FKP Soong US Patent App. 12/962,543, 2012 | 34 | 2012 |
Improved modeling for F0 generation and V/U decision in HMM-based TTS Q Zhang, F Soong, Y Qian, Z Yan, J Pan, Y Yan 2010 IEEE International Conference on Acoustics, Speech and Signal …, 2010 | 34 | 2010 |
Streaming chunk-aware multihead attention for online end-to-end speech recognition S Zhang, Z Gao, H Luo, M Lei, J Gao, Z Yan, L Xie arXiv preprint arXiv:2006.01712, 2020 | 30 | 2020 |
LauraGPT: Listen, attend, understand, and regenerate audio with GPT Q Chen, Y Chu, Z Gao, Z Li, K Hu, X Zhou, J Xu, Z Ma, W Wang, S Zheng, ... arXiv preprint arXiv:2310.04673, 2023 | 28 | 2023 |
An i-vector Based Approach to Training Data Clustering for Improved Speech Recognition. Y Zhang, J Xu, ZJ Yan, Q Huo Interspeech, 789-792, 2011 | 28 | 2011 |
Summary on the ICASSP 2022 multi-channel multi-party meeting transcription grand challenge F Yu, S Zhang, P Guo, Y Fu, Z Du, S Zheng, W Huang, L Xie, ZH Tan, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 27 | 2022 |