Arsha Nagrani
Research Scientist, Google
Verified email at google.com
Title
Cited by
Year
Voxceleb: a large-scale speaker identification dataset
A Nagrani, JS Chung, A Zisserman
arXiv preprint arXiv:1706.08612, 2017
Cited by 2333; year 2017
Voxceleb2: Deep speaker recognition
JS Chung, A Nagrani, A Zisserman
arXiv preprint arXiv:1806.05622, 2018
Cited by 2173; year 2018
Frozen in time: A joint video and image encoder for end-to-end retrieval
M Bain, A Nagrani, G Varol, A Zisserman
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by 685; year 2021
Voxceleb: Large-scale speaker verification in the wild
A Nagrani, JS Chung, W Xie, A Zisserman
Computer Speech & Language 60, 101027, 2020
Cited by 602; year 2020
Attention bottlenecks for multimodal fusion
A Nagrani, S Yang, A Arnab, A Jansen, C Schmid, C Sun
Advances in neural information processing systems 34, 14200-14213, 2021
Cited by 428; year 2021
Use what you have: Video retrieval using representations from collaborative experts
Y Liu, S Albanie, A Nagrani, A Zisserman
arXiv preprint arXiv:1907.13487, 2019
Cited by 392; year 2019
Utterance-level aggregation for speaker recognition in the wild
W Xie, A Nagrani, JS Chung, A Zisserman
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by 382; year 2019
Epic-fusion: Audio-visual temporal binding for egocentric action recognition
E Kazakos, A Nagrani, A Zisserman, D Damen
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by 345; year 2019
Emotion recognition in speech using cross-modal transfer in the wild
S Albanie, A Nagrani, A Vedaldi, A Zisserman
Proceedings of the 26th ACM international conference on Multimedia, 292-301, 2018
Cited by 292; year 2018
Seeing voices and hearing faces: Cross-modal biometric matching
A Nagrani, S Albanie, A Zisserman
Proceedings of the IEEE conference on computer vision and pattern …, 2018
Cited by 223; year 2018
Chimpanzee face recognition from videos in the wild using deep learning
D Schofield, A Nagrani, A Zisserman, M Hayashi, T Matsuzawa, D Biro, ...
Science advances 5 (9), eaaw0736, 2019
Cited by 179; year 2019
Localizing visual sounds the hard way
H Chen, W Xie, T Afouras, A Nagrani, A Vedaldi, A Zisserman
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by 143; year 2021
Learnable pins: Cross-modal embeddings for person identity
A Nagrani, S Albanie, A Zisserman
Proceedings of the European Conference on Computer Vision (ECCV), 71-88, 2018
Cited by 134; year 2018
Spot the conversation: speaker diarisation in the wild
JS Chung, J Huh, A Nagrani, T Afouras, A Zisserman
arXiv preprint arXiv:2007.01216, 2020
Cited by 129; year 2020
End-to-end generative pretraining for multimodal video captioning
PH Seo, A Nagrani, A Arnab, C Schmid
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 128; year 2022
Cough against covid: Evidence of covid-19 signature in cough sounds
P Bagad, A Dalmia, J Doshi, A Nagrani, P Bhamare, A Mahale, S Rane, ...
arXiv preprint arXiv:2009.08790, 2020
Cited by 127; year 2020
Disentangled speech embeddings using cross-modal self-supervision
A Nagrani, JS Chung, S Albanie, A Zisserman
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by 95; year 2020
Vid2seq: Large-scale pretraining of a visual language model for dense video captioning
A Yang, A Nagrani, PH Seo, A Miech, J Pont-Tuset, I Laptev, J Sivic, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 79; year 2023
Voxsrc 2020: The second voxceleb speaker recognition challenge
A Nagrani, JS Chung, J Huh, A Brown, E Coto, W Xie, M McLaren, ...
arXiv preprint arXiv:2012.06867, 2020
Cited by 78; year 2020
Condensed movies: Story based retrieval with contextual embeddings
M Bain, A Nagrani, A Brown, A Zisserman
Proceedings of the Asian Conference on Computer Vision, 2020
Cited by 76; year 2020