Dhiraj Kalamkar
Other names: Dhiraj D Kalamkar
Verified email at intel.com
Title
Cited by
Year
A study of BFLOAT16 for deep learning training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 297; Year: 2019
Distributed deep learning using synchronous stochastic gradient descent
D Das, S Avancha, D Mudigere, K Vaidynathan, S Sridharan, D Kalamkar, ...
arXiv preprint arXiv:1602.06709, 2016
Cited by: 207; Year: 2016
Mixed precision training of convolutional neural networks using integer operations
D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ...
arXiv preprint arXiv:1802.00930, 2018
Cited by: 187; Year: 2018
Performing power management in a multicore processor
VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...
US Patent 10,234,930, 2019
Cited by: 126; Year: 2019
Anatomy of high-performance deep learning convolutions on SIMD architectures
E Georganas, S Avancha, K Banerjee, D Kalamkar, G Henry, H Pabst, ...
SC18: International Conference for High Performance Computing, Networking …, 2018
Cited by: 122; Year: 2018
DistGNN: Scalable distributed training for large-scale graph neural networks
V Md, S Misra, G Ma, R Mohanty, E Georganas, A Heinecke, D Kalamkar, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 99; Year: 2021
Optimization of geometric multigrid for emerging multi- and manycore processors
S Williams, DD Kalamkar, A Singh, AM Deshpande, B Van Straalen, ...
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by: 91; Year: 2012
Lattice QCD on Intel® Xeon Phi™ Coprocessors
B Joo, DD Kalamkar, K Vaidyanathan, M Smelyanskiy, K Pamnany, ...
Supercomputing: 28th International Supercomputing Conference, ISC 2013 …, 2013
Cited by: 88; Year: 2013
Lattice QCD on Intel® Xeon Phi™ Coprocessors
B Joo, DD Kalamkar, K Vaidyanathan, M Smelyanskiy, K Pamnany, ...
Supercomputing: 28th International Supercomputing Conference, ISC 2013 …, 2013
Cited by: 84; Year: 2013
Efficient shared-memory implementation of high-performance conjugate gradient benchmark and its application to unstructured matrices
J Park, M Smelyanskiy, K Vaidyanathan, A Heinecke, DD Kalamkar, X Liu, ...
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 67; Year: 2014
Abstraction layers for scalable distributed machine learning
DD Kalamkar, K Vaidyanathan, S Sridharan, D Das
US Patent 11,094,029, 2021
Cited by: 66; Year: 2021
Improving concurrency and asynchrony in multithreaded MPI applications using software offloading
K Vaidyanathan, DD Kalamkar, K Pamnany, JR Hammond, P Balaji, ...
Proceedings of the International Conference for High Performance Computing …, 2015
Cited by: 56; Year: 2015
Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints
S Sridharan, J Dinan, DD Kalamkar
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 56; Year: 2014
Lattice QCD with domain decomposition on Intel® Xeon Phi co-processors
S Heybrock, B Joó, DD Kalamkar, M Smelyanskiy, K Vaidyanathan, ...
SC'14: Proceedings of the International Conference for High Performance …, 2014
Cited by: 48; Year: 2014
Optimizing deep learning recommender systems training on CPU cluster architectures
D Kalamkar, E Georganas, S Srinivasan, J Chen, M Shiryaev, A Heinecke
SC20: International Conference for High Performance Computing, Networking …, 2020
Cited by: 42; Year: 2020
Optimizing Wilson-Dirac Operator and Linear Solvers for Intel® KNL
B Joó, DD Kalamkar, T Kurth, K Vaidyanathan, A Walden
High Performance Computing: ISC High Performance 2016 International …, 2016
Cited by: 36; Year: 2016
On scale-out deep learning training for cloud and HPC
S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ...
arXiv preprint arXiv:1801.08030, 2018
Cited by: 34; Year: 2018
Performing power management in a multicore processor
VW Lee, D Kim, Y Bai, S Ji, S Li, DD Kalamkar, NK Mellempudi
US Patent 9,910,481, 2018
Cited by: 22; Year: 2018
Tensor processing primitives: A programming abstraction for efficiency and portability in deep learning workloads
E Georganas, D Kalamkar, S Avancha, M Adelman, C Anderson, A Breuer, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 20; Year: 2021
Harnessing deep learning via a single building block
E Georganas, K Banerjee, D Kalamkar, S Avancha, A Venkat, ...
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020
Cited by: 19; Year: 2020