Wei-Ning Hsu
Facebook AI Research (FAIR)
Verified email at csail.mit.edu
Title
Cited by
Year
HuBERT: Self-supervised speech representation learning by masked prediction of hidden units
WN Hsu, B Bolte, YHH Tsai, K Lakhotia, R Salakhutdinov, A Mohamed
IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3451-3460, 2021
Cited by: 1712. Year: 2021.
Data2vec: A general framework for self-supervised learning in speech, vision and language
A Baevski, WN Hsu, Q Xu, A Babu, J Gu, M Auli
International Conference on Machine Learning, 1298-1312, 2022
Cited by: 581. Year: 2022.
An unsupervised autoregressive model for speech representation learning
YA Chung, WN Hsu, H Tang, J Glass
INTERSPEECH, 2019
Cited by: 421. Year: 2019.
Unsupervised learning of disentangled and interpretable representations from sequential data
WN Hsu, Y Zhang, J Glass
Thirty-first Conference on Neural Information Processing Systems (NeurIPS), 2017
Cited by: 384. Year: 2017.
Hierarchical generative modeling for controllable speech synthesis
WN Hsu, Y Zhang, RJ Weiss, H Zen, Y Wu, Y Wang, Y Cao, Y Jia, Z Chen, ...
Seventh International Conference on Learning Representations (ICLR), 2019
Cited by: 282*. Year: 2019.
Unsupervised speech recognition
A Baevski, WN Hsu, A Conneau, M Auli
Advances in Neural Information Processing Systems 34, 27826-27839, 2021
Cited by: 250. Year: 2021.
On generative spoken language modeling from raw audio
K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ...
Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021
Cited by: 214. Year: 2021.
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
WN Hsu, A Sriram, A Baevski, T Likhomanenko, Q Xu, V Pratap, J Kahn, ...
INTERSPEECH, 2021
Cited by: 201. Year: 2021.
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ...
INTERSPEECH, 2021
Cited by: 196. Year: 2021.
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by: 193. Year: 2019.
Learning audio-visual speech representation by masked multimodal cluster prediction
B Shi, WN Hsu, K Lakhotia, A Mohamed
arXiv preprint arXiv:2201.02184, 2022
Cited by: 174. Year: 2022.
Learning Latent Representations for Speech Generation and Transformation
WN Hsu, Y Zhang, J Glass
INTERSPEECH, 1273-1277, 2017
Cited by: 173. Year: 2017.
Active learning by learning
WN Hsu, HT Lin
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
Cited by: 173. Year: 2015.
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation
WN Hsu, Y Zhang, J Glass
2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 16-23, 2017
Cited by: 156. Year: 2017.
Semi-supervised training for improving data efficiency in end-to-end speech synthesis
YA Chung, Y Wang, WN Hsu, Y Zhang, RJ Skerry-Ryan
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 135. Year: 2019.
Disentangling correlated speaker and noise for speech synthesis via data augmentation and adversarial factorization
WN Hsu, Y Zhang, RJ Weiss, YA Chung, Y Wang, Y Wu, J Glass
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 122. Year: 2019.
Direct speech-to-speech translation with discrete units
A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ...
arXiv preprint arXiv:2107.05604, 2021
Cited by: 104. Year: 2021.
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
D Harwath, WN Hsu, J Glass
Eighth International Conference on Learning Representations (ICLR), 2020
Cited by: 92. Year: 2020.
Multi-channel speech recognition: LSTMs all the way through
H Erdogan, T Hayashi, JR Hershey, T Hori, C Hori, WN Hsu, S Kim, ...
CHiME-4 workshop, 1-4, 2016
Cited by: 88. Year: 2016.
Textless speech-to-speech translation on real data
A Lee, H Gong, PA Duquenne, H Schwenk, PJ Chen, C Wang, S Popuri, ...
arXiv preprint arXiv:2112.08352, 2021
Cited by: 83. Year: 2021.