HPCToolkit: Tools for performance analysis of optimized parallel programs L Adhianto, S Banerjee, M Fagan, M Krentel, G Marin, J Mellor-Crummey, ... Concurrency and Computation: Practice and Experience 22 (6), 685-701, 2010 | 941 | 2010 |
Evaluating modern GPU interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect A Li, SL Song, J Chen, J Li, X Liu, NR Tallent, KJ Barker IEEE Transactions on Parallel and Distributed Systems 31 (1), 94-110, 2019 | 278 | 2019 |
OpenAD/F: A modular open-source tool for automatic differentiation of Fortran codes J Utke, U Naumann, M Fagan, N Tallent, M Strout, P Heimbach, C Hill, ... ACM Transactions on Mathematical Software (TOMS) 34 (4), 1-36, 2008 | 192 | 2008 |
HPCView: A tool for top-down analysis of node performance J Mellor-Crummey, RJ Fowler, G Marin, N Tallent The Journal of Supercomputing 23, 81-104, 2002 | 168 | 2002 |
Analyzing lock contention in multithreaded applications NR Tallent, JM Mellor-Crummey, A Porterfield Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of …, 2010 | 157 | 2010 |
Effective performance measurement and analysis of multithreaded applications NR Tallent, JM Mellor-Crummey Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of …, 2009 | 151 | 2009 |
Scalable identification of load imbalance in parallel executions using call path profiles NR Tallent, L Adhianto, JM Mellor-Crummey SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 99 | 2010 |
PERFECT (Power Efficiency Revolution for Embedded Computing Technologies) benchmark suite manual K Barker, T Benson, D Campbell, D Ediger, R Gioiosa, A Hoisie, ... Pacific Northwest National Laboratory and Georgia Tech Research Institute, 2013 | 85 | 2013 |
Palm: Easing the burden of analytical performance modeling NR Tallent, A Hoisie Proceedings of the 28th ACM international conference on Supercomputing, 221-230, 2014 | 84 | 2014 |
Binary analysis for measurement and attribution of program performance NR Tallent, JM Mellor-Crummey, MW Fagan ACM Sigplan Notices 44 (6), 441-452, 2009 | 80 | 2009 |
Tartan: evaluating modern GPU interconnect via a multi-GPU benchmark suite A Li, SL Song, J Chen, X Liu, N Tallent, K Barker 2018 IEEE International Symposium on Workload Characterization (IISWC), 191-202, 2018 | 68 | 2018 |
Scaling deep learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing NA Gawande, JA Daily, C Siegel, NR Tallent, A Vishnu Future Generation Computer Systems 108, 1162-1172, 2020 | 59 | 2020 |
HPCToolkit: performance tools for scientific computing N Tallent, J Mellor-Crummey, L Adhianto, M Fagan, M Krentel Journal of Physics: Conference Series 125 (1), 012088, 2008 | 58 | 2008 |
Scalable fine-grained call path tracing NR Tallent, J Mellor-Crummey, M Franco, R Landrum, L Adhianto Proceedings of the international conference on Supercomputing, 63-74, 2011 | 52 | 2011 |
Diagnosing performance bottlenecks in emerging petascale applications NR Tallent, JM Mellor-Crummey, L Adhianto, MW Fagan, M Krentel Proceedings of the Conference on High Performance Computing Networking …, 2009 | 51 | 2009 |
A case for application-oblivious energy-efficient MPI runtime A Venkatesh, A Vishnu, K Hamidouche, N Tallent, D Panda, D Kerbyson, ... Proceedings of the international conference for high performance computing …, 2015 | 46 | 2015 |
Evaluating on-node GPU interconnects for deep learning workloads NR Tallent, NA Gawande, C Siegel, A Vishnu, A Hoisie High Performance Computing Systems. Performance Modeling, Benchmarking, and …, 2018 | 37 | 2018 |
HPCToolkit: Tools for performance analysis of optimized parallel programs. http://hpctoolkit.org L Adhianto, S Banerjee, M Fagan, M Krentel, G Marin, J Mellor-Crummey, ... Concurr. Comput.: Pract. Exper. 22: 685–701, 2010 | 30 | 2010 |
Fault modeling of extreme scale applications using machine learning A Vishnu, H Van Dam, NR Tallent, DJ Kerbyson, A Hoisie 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2016 | 29 | 2016 |
Vertex reordering for real-world graphs and applications: An empirical evaluation R Barik, M Minutoli, M Halappanavar, NR Tallent, A Kalyanaraman 2020 IEEE International Symposium on Workload Characterization (IISWC), 240-251, 2020 | 26 | 2020 |