Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 2192* | 2022 |
Scaling language models: Methods, analysis & insights from training Gopher JW Rae, S Borgeaud, T Cai, K Millican, J Hoffmann, F Song, J Aslanides, ... arXiv preprint arXiv:2112.11446, 2021 | 1249* | 2021 |
InfoGraph: Unsupervised and semi-supervised graph-level representation learning via mutual information maximization FY Sun, J Hoffmann, V Verma, J Tang arXiv preprint arXiv:1908.01000, 2019 | 1079 | 2019 |
Improving language models by retrieving from trillions of tokens S Borgeaud, A Mensch, J Hoffmann, T Cai, E Rutherford, K Millican, ... International conference on machine learning, 2206-2240, 2022 | 1010 | 2022 |
Recurrent independent mechanisms A Goyal, A Lamb, J Hoffmann, S Sodhani, S Levine, Y Bengio, ... arXiv preprint arXiv:1909.10893, 2019 | 366 | 2019 |
Unified scaling laws for routed language models A Clark, D de Las Casas, A Guy, A Mensch, M Paganini, J Hoffmann, ... International conference on machine learning, 4057-4086, 2022 | 152* | 2022 |
An empirical analysis of compute-optimal large language model training J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... Advances in Neural Information Processing Systems 35, 30016-30030, 2022 | 141 | 2022 |
Reconnaissance of the HR 8799 exosolar system. II. Astrometry and orbital motion L Pueyo, R Soummer, J Hoffmann, R Oppenheimer, JR Graham, ... The Astrophysical Journal 803 (1), 31, 2015 | 120 | 2015 |
vGraph: A generative model for joint community detection and node representation learning FY Sun, M Qu, J Hoffmann, CW Huang, J Tang Advances in Neural Information Processing Systems 32, 2019 | 110 | 2019 |
Data-driven approach to encoding and decoding 3-D crystal structures J Hoffmann, L Maestrati, Y Sawada, J Tang, JM Sellier, Y Bengio arXiv preprint arXiv:1909.00949, 2019 | 87 | 2019 |
Machine learning in a data-limited regime: Augmenting experiments with synthetic data uncovers order in crumpled sheets J Hoffmann, Y Bar-Sinai, LM Lee, J Andrejevic, S Mishra, SM Rubinstein, ... Science advances 5 (4), eaau6792, 2019 | 75 | 2019 |
A systematic investigation of commonsense knowledge in large language models XL Li, A Kuncoro, J Hoffmann, CM d'Autume, P Blunsom, A Nematzadeh arXiv preprint arXiv:2111.00607, 2021 | 61 | 2021 |
Ion correlations in nanofluidic channels: Effects of ion size, valence, and concentration on voltage- and pressure-driven currents J Hoffmann, D Gillespie Langmuir 29 (4), 1303-1317, 2013 | 58 | 2013 |
A simple developmental model recapitulates complex insect wing venation patterns J Hoffmann, S Donoughe, K Li, MK Salcedo, CH Rycroft Proceedings of the National Academy of Sciences 115 (40), 9905-9910, 2018 | 49 | 2018 |
Computational analysis of size, shape and structure of insect wings MK Salcedo, J Hoffmann, S Donoughe, L Mahadevan Biology Open 8 (10), bio040774, 2019 | 46 | 2019 |
Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 33 | 2022 |
Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 32 | 2022 |
Nuclear speed and cycle length co-vary with local density during syncytial blastoderm formation in a cricket S Donoughe, J Hoffmann, T Nakamura, CH Rycroft, CG Extavour Nature communications 13 (1), 3889, 2022 | 22* | 2022 |
The role of negative selection in protein evolution revealed through the energetics of the native state ensemble J Hoffmann, JO Wrabl, VJ Hilser Proteins: Structure, Function, and Bioinformatics 84 (4), 435-447, 2016 | 21 | 2016 |
Training compute-optimal large language models J Hoffmann, S Borgeaud, A Mensch, E Buchatskaya, T Cai, E Rutherford, ... arXiv preprint arXiv:2203.15556, 2022 | 18 | 2022 |