Sheng Shen
Verified email at berkeley.edu
Title | Cited by | Year
Multitask prompted training enables zero-shot task generalization
V Sanh, A Webson, C Raffel, SH Bach, L Sutawika, Z Alyafeai, A Chaffin, ...
ICLR 2022, 2021
Cited by: 1576 | Year: 2021
BLOOM: A 176B-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
Cited by: 1506 | Year: 2023
Q-BERT: Hessian based ultra low precision quantization of BERT
S Shen, Z Dong, J Ye, L Ma, Z Yao, A Gholami, MW Mahoney, K Keutzer
AAAI 2020, 2019
Cited by: 564 | Year: 2019
Crosslingual generalization through multitask finetuning
N Muennighoff, T Wang, L Sutawika, A Roberts, S Biderman, TL Scao, ...
ACL 2023, 2022
Cited by: 563 | Year: 2022
How Much Can CLIP Benefit Vision-and-Language Tasks?
S Shen*, LH Li*, H Tan, M Bansal, A Rohrbach, KW Chang, Z Yao, ...
ICLR 2022, 2021
Cited by: 403 | Year: 2021
The Llama 3 herd of models
A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...
arXiv preprint arXiv:2407.21783, 2024
Cited by: 401 | Year: 2024
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Z Li*, E Wallace*, S Shen*, K Lin*, K Keutzer, D Klein, JE Gonzalez
ICML 2020, 2020
Cited by: 292 | Year: 2020
AgentBench: Evaluating LLMs as agents
X Liu, H Yu, H Zhang, Y Xu, X Lei, H Lai, Y Gu, H Ding, K Men, K Yang, ...
arXiv preprint arXiv:2308.03688, 2023
Cited by: 257* | Year: 2023
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Z Yao, A Gholami, S Shen, K Keutzer, MW Mahoney
AAAI 2021, 2020
Cited by: 257 | Year: 2020
LLaVA-NeXT: Improved reasoning, OCR, and world knowledge
H Liu, C Li, Y Li, B Li, Y Zhang, S Shen, YJ Lee
Cited by: 142 | Year: 2024
Aligning large multimodal models with factually augmented RLHF
Z Sun*, S Shen*, S Cao*, H Liu, C Li, Y Shen, C Gan, LY Gui, YX Wang, ...
arXiv preprint arXiv:2309.14525, 2023
Cited by: 137 | Year: 2023
Learned token pruning for transformers
S Kim*, S Shen*, D Thorsley, A Gholami, W Kwon, J Hassoun, K Keutzer
KDD 2022, 2021
Cited by: 125 | Year: 2021
Poisoning Language Models During Instruction Tuning
A Wan*, E Wallace*, S Shen, D Klein
ICML 2023, 2023
Cited by: 113 | Year: 2023
SqueezeLLM: Dense-and-Sparse Quantization
S Kim*, C Hooper*, A Gholami*, Z Dong, X Li, S Shen, MW Mahoney, ...
arXiv preprint arXiv:2306.07629, 2023
Cited by: 109 | Year: 2023
An annotated dataset of literary entities
D Bamman, S Popat, S Shen
NAACL 2019, 2019
Cited by: 103 | Year: 2019
What Language Model to Train if You Have One Million GPU Hours?
T Le Scao, T Wang, D Hesslow, L Saulnier, S Bekman, MS Bari, ...
EMNLP 2022, 2022
Cited by: 96 | Year: 2022
PowerNorm: Rethinking batch normalization in transformers
S Shen, Z Yao, A Gholami, M Mahoney, K Keutzer
ICML 2020, 2020
Cited by: 89 | Year: 2020
Ermes: Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification
Z Chen*, S Shen*, Z Hu, X Lu, Q Mei, X Liu
WWW 2019, 2018
Cited by: 87* | Year: 2018
K-LITE: Learning transferable visual models with external knowledge
S Shen, C Li, X Hu, Y Xie, J Yang, P Zhang, A Rohrbach, Z Gan, L Wang, ...
NeurIPS 2022, 2022
Cited by: 83 | Year: 2022
Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
S Shen, L Hou, Y Zhou, N Du, S Longpre, J Wei, HW Chung, B Zoph, ...
ICLR 2024, 2023
Cited by: 72* | Year: 2023
Articles 1–20