Brian Yan
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
S Arora, S Dalmia, P Denisov, X Chang, Y Ueda, Y Peng, Y Zhang, ...
ICASSP 2022, 2022
Searchable hidden intermediates for end-to-end models of decomposable sequence tasks
S Dalmia, B Yan, V Raunak, F Metze, S Watanabe
NAACL 2021, 2021
CTC Alignments Improve Autoregressive Translation
B Yan, S Dalmia, Y Higuchi, G Neubig, F Metze, AW Black, S Watanabe
EACL 2023, 2022
Improving massively multilingual ASR with auxiliary CTC objectives
W Chen, B Yan, J Shi, Y Peng, S Maiti, S Watanabe
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023
Prompting the hidden talent of web-scale speech models for zero-shot task generalization
P Peng, B Yan, S Watanabe, D Harwath
arXiv preprint arXiv:2305.11095, 2023
Exploration of efficient end-to-end ASR using discretized input from self-supervised learning
X Chang, B Yan, Y Fujita, T Maekaku, S Watanabe
arXiv preprint arXiv:2305.18108, 2023
BERT meets CTC: New formulation of end-to-end speech recognition with pre-trained masked language model
Y Higuchi, B Yan, S Arora, T Ogawa, T Kobayashi, S Watanabe
EMNLP 2022, 2022
ESPnet-ST IWSLT 2021 Offline Speech Translation System
H Inaguma, B Yan, S Dalmia, P Gu, J Shi, K Duh, S Watanabe
IWSLT 2021, 2021
Reproducing Whisper-style training using an open-source toolkit and publicly available data
Y Peng, J Tian, B Yan, D Berrebbi, X Chang, X Li, J Shi, S Arora, W Chen, ...
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
ESPnet-SE++: Speech enhancement for robust speech recognition, translation, and understanding
YJ Lu, X Chang, C Li, W Zhang, S Cornell, Z Ni, Y Masuyama, B Yan, ...
arXiv preprint arXiv:2207.09514, 2022
Combining spectral and self-supervised features for low resource speech recognition and translation
D Berrebbi, J Shi, B Yan, O López-Francisco, JD Amith, S Watanabe
arXiv preprint arXiv:2204.02470, 2022
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization
B Yan, C Zhang, M Yu, SX Zhang, S Dalmia, D Berrebbi, C Weng, ...
ICASSP 2022, 2022
Two-pass low latency end-to-end spoken language understanding
S Arora, S Dalmia, X Chang, B Yan, A Black, S Watanabe
arXiv preprint arXiv:2207.06670, 2022
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study
X Chang, B Yan, K Choi, JW Jung, Y Lu, S Maiti, R Sharma, J Shi, J Tian, ...
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024
Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates
H Inaguma, S Dalmia, B Yan, S Watanabe
ASRU 2021, 2021
Differentiable Allophone Graphs for Language-Universal Speech Recognition
B Yan, S Dalmia, DR Mortensen, F Metze, S Watanabe
INTERSPEECH 2021, 2021
CMU’s IWSLT 2022 dialect speech translation system
B Yan, P Fernandes, S Dalmia, J Shi, Y Peng, D Berrebbi, X Wang, ...
Proceedings of the 19th International Conference on Spoken Language …, 2022
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Y Peng, J Tian, W Chen, S Arora, B Yan, Y Sudo, M Shakeel, K Choi, ...
arXiv preprint arXiv:2401.16658, 2024
A comparative study on E-Branchformer vs Conformer in speech recognition, translation, and understanding tasks
Y Peng, K Kim, F Wu, B Yan, S Arora, W Chen, J Tang, S Shon, P Sridhar, ...
arXiv preprint arXiv:2305.11073, 2023
4D ASR: Joint modeling of CTC, attention, transducer, and mask-predict decoders
Y Sudo, M Shakeel, B Yan, J Shi, S Watanabe
arXiv preprint arXiv:2212.10818, 2022