Dhiraj Kalamkar

Cited by

	All	Since 2019
Citations	1881	1436
h-index	19	17
i10-index	30	26

Citations per year

Year	Citations
2012	5
2013	12
2014	44
2015	65
2016	86
2017	95
2018	110
2019	155
2020	231
2021	312
2022	343
2023	316
2024	74

Public access

5 articles	1 article
available	not available

Based on funding mandates

Other names: Dhiraj D Kalamkar

Intel Corporation

Verified email at intel.com


Title	Authors	Venue	Cited by	Year
A study of BFLOAT16 for deep learning training	D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...	arXiv preprint arXiv:1905.12322, 2019	297	2019
Distributed deep learning using synchronous stochastic gradient descent	D Das, S Avancha, D Mudigere, K Vaidynathan, S Sridharan, D Kalamkar, ...	arXiv preprint arXiv:1602.06709, 2016	207	2016
Mixed precision training of convolutional neural networks using integer operations	D Das, N Mellempudi, D Mudigere, D Kalamkar, S Avancha, K Banerjee, ...	arXiv preprint arXiv:1802.00930, 2018	187	2018
Performing power management in a multicore processor	VW Lee, ET Grochowski, D Kim, Y Bai, S Li, NK Mellempudi, ...	US Patent 10,234,930, 2019	126	2019
Anatomy of high-performance deep learning convolutions on SIMD architectures	E Georganas, S Avancha, K Banerjee, D Kalamkar, G Henry, H Pabst, ...	SC18: International Conference for High Performance Computing, Networking …, 2018	122	2018
DistGNN: Scalable distributed training for large-scale graph neural networks	V Md, S Misra, G Ma, R Mohanty, E Georganas, A Heinecke, D Kalamkar, ...	Proceedings of the International Conference for High Performance Computing …, 2021	99	2021
Optimization of geometric multigrid for emerging multi- and manycore processors	S Williams, DD Kalamkar, A Singh, AM Deshpande, B Van Straalen, ...	SC'12: Proceedings of the International Conference on High Performance …, 2012	91	2012
Lattice QCD on Intel® Xeon Phi™ Coprocessors	B Joo, DD Kalamkar, K Vaidyanathan, M Smelyanskiy, K Pamnany, ...	Supercomputing: 28th International Supercomputing Conference, ISC 2013 …, 2013	88	2013
Lattice QCD on Intel® Xeon Phi™ Coprocessors	B Joo, DD Kalamkar, K Vaidyanathan, M Smelyanskiy, K Pamnany, ...	Supercomputing: 28th International Supercomputing Conference, ISC 2013 …, 2013	84	2013
Efficient shared-memory implementation of high-performance conjugate gradient benchmark and its application to unstructured matrices	J Park, M Smelyanskiy, K Vaidyanathan, A Heinecke, DD Kalamkar, X Liu, ...	SC'14: Proceedings of the International Conference for High Performance …, 2014	67	2014
Abstraction layers for scalable distributed machine learning	DD Kalamkar, K Vaidyanathan, S Sridharan, D Das	US Patent 11,094,029, 2021	66	2021
Improving concurrency and asynchrony in multithreaded MPI applications using software offloading	K Vaidyanathan, DD Kalamkar, K Pamnany, JR Hammond, P Balaji, ...	Proceedings of the International Conference for High Performance Computing …, 2015	56	2015
Enabling efficient multithreaded MPI communication through a library-based implementation of MPI endpoints	S Sridharan, J Dinan, DD Kalamkar	SC'14: Proceedings of the International Conference for High Performance …, 2014	56	2014
Lattice QCD with domain decomposition on Intel® Xeon Phi co-processors	S Heybrock, B Joó, DD Kalamkar, M Smelyanskiy, K Vaidyanathan, ...	SC'14: Proceedings of the International Conference for High Performance …, 2014	48	2014
Optimizing deep learning recommender systems training on CPU cluster architectures	D Kalamkar, E Georganas, S Srinivasan, J Chen, M Shiryaev, A Heinecke	SC20: International Conference for High Performance Computing, Networking …, 2020	42	2020
Optimizing Wilson-Dirac Operator and Linear Solvers for Intel® KNL	B Joó, DD Kalamkar, T Kurth, K Vaidyanathan, A Walden	High Performance Computing: ISC High Performance 2016 International …, 2016	36	2016
On scale-out deep learning training for cloud and HPC	S Sridharan, K Vaidyanathan, D Kalamkar, D Das, ME Smorkalov, ...	arXiv preprint arXiv:1801.08030, 2018	34	2018
Performing power management in a multicore processor	VW Lee, D Kim, Y Bai, S Ji, S Li, DD Kalamkar, NK Mellempudi	US Patent 9,910,481, 2018	22	2018
Tensor processing primitives: A programming abstraction for efficiency and portability in deep learning workloads	E Georganas, D Kalamkar, S Avancha, M Adelman, C Anderson, A Breuer, ...	Proceedings of the International Conference for High Performance Computing …, 2021	20	2021
Harnessing deep learning via a single building block	E Georganas, K Banerjee, D Kalamkar, S Avancha, A Venkat, ...	2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020	19	2020
