Brian Yan
Title
Cited by
Year
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
S Arora, S Dalmia, P Denisov, X Chang, Y Ueda, Y Peng, Y Zhang, ...
ICASSP 2022, 2022
Cited by: 79; Year: 2022
Exploration of efficient end-to-end ASR using discretized input from self-supervised learning
X Chang, B Yan, Y Fujita, T Maekaku, S Watanabe
arXiv preprint arXiv:2305.18108, 2023
Cited by: 44; Year: 2023
Prompting the hidden talent of web-scale speech models for zero-shot task generalization
P Peng, B Yan, S Watanabe, D Harwath
INTERSPEECH 2023, 2023
Cited by: 40; Year: 2023
Improving massively multilingual ASR with auxiliary CTC objectives
W Chen, B Yan, J Shi, Y Peng, S Maiti, S Watanabe
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 38; Year: 2023
Reproducing Whisper-style training using an open-source toolkit and publicly available data
Y Peng, J Tian, B Yan, D Berrebbi, X Chang, X Li, J Shi, S Arora, W Chen, ...
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by: 33; Year: 2023
CTC Alignments Improve Autoregressive Translation
B Yan, S Dalmia, Y Higuchi, G Neubig, F Metze, AW Black, S Watanabe
EACL 2023, 2022
Cited by: 32; Year: 2022
Searchable hidden intermediates for end-to-end models of decomposable sequence tasks
S Dalmia, B Yan, V Raunak, F Metze, S Watanabe
NAACL 2021, 2021
Cited by: 32; Year: 2021
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study
X Chang, B Yan, K Choi, JW Jung, Y Lu, S Maiti, R Sharma, J Shi, J Tian, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 31; Year: 2024
ESPnet-SE++: Speech enhancement for robust speech recognition, translation, and understanding
YJ Lu, X Chang, C Li, W Zhang, S Cornell, Z Ni, Y Masuyama, B Yan, ...
arXiv preprint arXiv:2207.09514, 2022
Cited by: 28; Year: 2022
BERT meets CTC: New formulation of end-to-end speech recognition with pre-trained masked language model
Y Higuchi, B Yan, S Arora, T Ogawa, T Kobayashi, S Watanabe
EMNLP 2022, 2022
Cited by: 26; Year: 2022
Combining spectral and self-supervised features for low resource speech recognition and translation
D Berrebbi, J Shi, B Yan, O López-Francisco, JD Amith, S Watanabe
arXiv preprint arXiv:2204.02470, 2022
Cited by: 25; Year: 2022
OWSM v3.1: Better and faster open Whisper-style speech models based on E-Branchformer
Y Peng, J Tian, W Chen, S Arora, B Yan, Y Sudo, M Shakeel, K Choi, ...
arXiv preprint arXiv:2401.16658, 2024
Cited by: 24; Year: 2024
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization
B Yan, C Zhang, M Yu, SX Zhang, S Dalmia, D Berrebbi, C Weng, ...
ICASSP 2022, 2022
Cited by: 22; Year: 2022
ESPnet-ST IWSLT 2021 Offline Speech Translation System
H Inaguma, B Yan, S Dalmia, P Gu, J Shi, K Duh, S Watanabe
IWSLT 2021, 2021
Cited by: 20; Year: 2021
Towards zero-shot code-switched speech recognition
B Yan, M Wiesner, O Klejch, P Jyothi, S Watanabe
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 18; Year: 2023
Two-pass low latency end-to-end spoken language understanding
S Arora, S Dalmia, X Chang, B Yan, A Black, S Watanabe
arXiv preprint arXiv:2207.06670, 2022
Cited by: 17; Year: 2022
4D ASR: Joint modeling of CTC, attention, transducer, and mask-predict decoders
Y Sudo, M Shakeel, B Yan, J Shi, S Watanabe
arXiv preprint arXiv:2212.10818, 2022
Cited by: 16; Year: 2022
A comparative study on E-Branchformer vs Conformer in speech recognition, translation, and understanding tasks
Y Peng, K Kim, F Wu, B Yan, S Arora, W Chen, J Tang, S Shon, P Sridhar, ...
arXiv preprint arXiv:2305.11073, 2023
Cited by: 15; Year: 2023
Token-level sequence labeling for spoken language understanding using compositional end-to-end models
S Arora, S Dalmia, B Yan, F Metze, AW Black, S Watanabe
arXiv preprint arXiv:2210.15734, 2022
Cited by: 14; Year: 2022
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
B Yan, J Shi, Y Tang, H Inaguma, Y Peng, S Dalmia, P Polák, ...
ACL 2023, Demo Track, 2023
Cited by: 13; Year: 2023