Amit Sabne
Title
Cited by
Year
SALSA: Systematic logic synthesis of approximate circuits
S Venkataramani, A Sabne, V Kozhikkottu, K Roy, A Raghunathan
Proceedings of the 49th Annual Design Automation Conference, 796-801, 2012
Cited by: 408, Year: 2012
A learned performance model for tensor processing units
S Kaufman, P Phothilimthana, Y Zhou, C Mendis, S Roy, A Sabne, ...
Proceedings of Machine Learning and Systems 3, 387-400, 2021
Cited by: 76, Year: 2021
High performance model based image reconstruction
X Wang, A Sabne, S Kisner, A Raghunathan, C Bouman, S Midkiff
ACM SIGPLAN Notices 51 (8), 1-12, 2016
Cited by: 70, Year: 2016
Fast distributed bandits for online recommendation systems
K Mahadik, Q Wu, S Li, A Sabne
Proceedings of the 34th ACM international conference on supercomputing, 1-13, 2020
Cited by: 67, Year: 2020
XLA: Compiling machine learning for peak performance
A Sabne
Google Res, 2020
Cited by: 61, Year: 2020
Pagoda: Fine-grained GPU resource virtualization for narrow tasks
TT Yeh, A Sabne, P Sakdhnagool, R Eigenmann, TG Rogers
ACM SIGPLAN Notices 52 (8), 221-234, 2017
Cited by: 59, Year: 2017
Model-based iterative CT image reconstruction on GPUs
A Sabne, X Wang, SJ Kisner, CA Bouman, A Raghunathan, SP Midkiff
ACM SIGPLAN Notices 52 (8), 207-220, 2017
Cited by: 47, Year: 2017
Overlap communication with dependent computation via decomposition in large deep learning models
S Wang, J Wei, A Sabne, A Davis, B Ilbeyi, B Hechtman, D Chen, ...
Proceedings of the 28th ACM International Conference on Architectural …, 2022
Cited by: 43, Year: 2022
Massively parallel 3D image reconstruction
X Wang, A Sabne, P Sakdhnagool, SJ Kisner, CA Bouman, SP Midkiff
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by: 42, Year: 2017
Evaluating performance portability of OpenACC
A Sabne, P Sakdhnagool, S Lee, JS Vetter
Languages and Compilers for Parallel Computing: 27th International Workshop …, 2015
Cited by: 42, Year: 2015
Scaling large-data computations on multi-GPU accelerators
A Sabne, P Sakdhnagool, R Eigenmann
Proceedings of the 27th international ACM conference on International …, 2013
Cited by: 35, Year: 2013
HeteroDoop: A MapReduce programming system for accelerator clusters
A Sabne, P Sakdhnagool, R Eigenmann
Proceedings of the 24th International Symposium on High-Performance Parallel …, 2015
Cited by: 28, Year: 2015
A flexible approach to autotuning multi-pass machine learning compilers
PM Phothilimthana, A Sabne, N Sarda, KS Murthy, Y Zhou, ...
2021 30th International Conference on Parallel Architectures and Compilation …, 2021
Cited by: 27, Year: 2021
A generic low power scan chain wrapper for designs using scan compression
A Sabne, R Tiwari, A Shrivastava, S Ravi, R Parekhji
2010 28th VLSI Test Symposium (VTS), 135-140, 2010
Cited by: 21, Year: 2010
Logic synthesis of approximate circuits
S Venkataramani, VJ Kozhikkottu, A Sabne, K Roy, A Raghunathan
IEEE Transactions on Computer-Aided Design of Integrated Circuits and …, 2019
Cited by: 20, Year: 2019
Confluence analysis and loop fast-forwarding for improving SIMD execution efficiency
AJ Sabne, Y Lin, V Grover
US Patent 9,612,811, 2017
Cited by: 20, Year: 2017
Understanding portability of a high-level programming model on contemporary heterogeneous architectures
A Sabne, P Sakdhnagool, S Lee, JS Vetter
IEEE Micro 35 (4), 48-58, 2015
Cited by: 16, Year: 2015
System and method for compiling or runtime executing a fork-join data parallel program with function calls on a single-instruction-multiple-thread processor
Y Lin, G Chakrabarti, J Marathe, O Kwon, A Sabne
US Patent 9,747,107, 2017
Cited by: 14, Year: 2017
XLA: Compiling machine learning for peak performance (2020)
A Sabne
There is no corresponding record for this reference, 2020
Cited by: 13, Year: 2020
System and method for executing sequential code using a group of threads and single-instruction, multiple-thread processor incorporating the same
G Chakrabarti, Y Lin, J Marathe, O Kwon, A Sabne
US Patent 9,436,475, 2016
Cited by: 13, Year: 2016