HiFi-Codec: Group-residual vector quantization for high fidelity audio codec D Yang, S Liu, R Huang, J Tian, C Weng, Y Zou arXiv preprint arXiv:2305.02765, 2023 | 106 | 2023 |
UniAudio: An audio foundation model toward universal audio generation D Yang, J Tian, X Tan, R Huang, S Liu, X Chang, J Shi, S Zhao, J Bian, ... arXiv preprint arXiv:2310.00704, 2023 | 96 | 2023 |
Reproducing Whisper-style training using an open-source toolkit and publicly available data Y Peng, J Tian, B Yan, D Berrebbi, X Chang, X Li, J Shi, S Arora, W Chen, ... 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023 | 41 | 2023 |
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study X Chang, B Yan, K Choi, JW Jung, Y Lu, S Maiti, R Sharma, J Shi, J Tian, ... ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 35 | 2024 |
OWSM v3.1: Better and faster open Whisper-style speech models based on E-Branchformer Y Peng, J Tian, W Chen, S Arora, B Yan, Y Sudo, M Shakeel, K Choi, ... arXiv preprint arXiv:2401.16658, 2024 | 30 | 2024 |
LAE: Language-aware encoder for monolingual and multilingual ASR J Tian, J Yu, C Zhang, C Weng, Y Zou, D Yu Interspeech 2022, 2022 | 23 | 2022 |
Consistent training and decoding for end-to-end speech recognition using lattice-free MMI J Tian, J Yu, C Weng, SX Zhang, D Su, D Yu, Y Zou ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 14 | 2022 |
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units X Chang, J Shi, J Tian, Y Wu, Y Tang, Y Wu, S Watanabe, Y Adi, X Chen, ... arXiv preprint arXiv:2406.07725, 2024 | 13 | 2024 |
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model J Tian, J Yu, C Weng, Y Zou, D Yu IEEE Signal Processing Letters 29, 812-816, 2022 | 12 | 2022 |
Integrating Lattice-Free MMI into End-to-End Speech Recognition J Tian, J Yu, C Weng, Y Zou, D Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2022 | 11* | 2022 |
Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks J Tian, B Yan, J Yu, C Weng, D Yu, S Watanabe International Conference on Learning Representations (ICLR) 2023, 2022 | 10 | 2022 |
AutoPrep: An Automatic Preprocessing Framework for In-The-Wild Speech Data J Yu, H Chen, Y Bian, X Li, Y Luo, J Tian, M Liu, J Jiang, S Wang ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 7 | 2024 |
ESPnet-Codec: Comprehensive training and evaluation of neural codecs for audio, music, and speech J Shi, J Tian, Y Wu, J Jung, JQ Yip, Y Masuyama, W Chen, Y Wu, Y Tang, ... 2024 IEEE Spoken Language Technology Workshop (SLT), 562-569, 2024 | 6 | 2024 |
UniAudio: Towards Universal Audio Generation with Large Language Models D Yang, J Tian, X Tan, R Huang, S Liu, H Guo, X Chang, J Shi, J Bian, ... Forty-first International Conference on Machine Learning | 6 | |
Towards robust speech representation learning for thousands of languages W Chen, W Zhang, Y Peng, X Li, J Tian, J Shi, X Chang, S Maiti, K Livescu, ... arXiv preprint arXiv:2407.00837, 2024 | 5 | 2024 |
Make-A-Voice: Revisiting voice large language models as scalable multilingual and multitask learners R Huang, C Zhang, Y Wang, D Yang, J Tian, Z Ye, L Liu, Z Wang, Z Jiang, ... Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024 | 4 | 2024 |
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models J Tian, Y Peng, W Chen, K Choi, K Livescu, S Watanabe arXiv preprint arXiv:2406.09282, 2024 | 4 | 2024 |
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets J Shi, SH Wang, W Chen, M Bartelds, VB Kumar, J Tian, X Chang, ... arXiv preprint arXiv:2406.08641, 2024 | 3 | 2024 |
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction Z Zhao, R Gu, D Yang, J Tian, Y Zou Interspeech 2022, 2022 | 3 | 2022 |
A random gossip BMUF process for neural language modeling Y Huang, J Tian, L Han, G Wang, X Song, D Su, D Yu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 3 | 2020 |