On layer normalization in the transformer architecture R Xiong, Y Yang, D He, K Zheng, S Zheng, C Xing, H Zhang, Y Lan, ... International Conference on Machine Learning, 10524-10533, 2020 | 190 | 2020 |
Reshaped Wirtinger flow for solving quadratic system of equations H Zhang, Y Liang Advances in Neural Information Processing Systems 29, 2016 | 126 | 2016 |
Provable non-convex phase retrieval with outliers: Median truncated Wirtinger flow H Zhang, Y Chi, Y Liang International Conference on Machine Learning, 1022-1031, 2016 | 97 | 2016 |
A nonconvex approach for phase retrieval: Reshaped Wirtinger flow and incremental algorithms H Zhang, Y Zhou, Y Liang, Y Chi Journal of Machine Learning Research 18, 2017 | 91 | 2017 |
SGD converges to global minimum in deep learning via star-convex path Y Zhou, J Yang, H Zhang, Y Liang, V Tarokh arXiv preprint arXiv:1901.00451, 2019 | 44 | 2019 |
Reshaped Wirtinger flow and incremental algorithm for solving quadratic system of equations H Zhang, Y Zhou, Y Liang, Y Chi arXiv preprint arXiv:1605.07719, 2016 | 28 | 2016 |
Block-diagonal Hessian-free optimization for recurrent and convolutional neural networks H Zhang, C Xiong US Patent App. 15/983,782, 2018 | 27 | 2018 |
Median-truncated nonconvex approach for phase retrieval with outliers H Zhang, Y Chi, Y Liang IEEE Transactions on Information Theory 64 (11), 7287-7310, 2018 | 27 | 2018 |
The capacity region of the source-type model for secret key and private key generation H Zhang, L Lai, Y Liang, H Wang IEEE Transactions on Information Theory 60 (10), 6389-6398, 2014 | 27 | 2014 |
Non-convex low-rank matrix recovery with arbitrary outliers via median-truncated gradient descent Y Li, Y Chi, H Zhang, Y Liang Information and Inference: A Journal of the IMA 9 (2), 289-325, 2020 | 23 | 2020 |
G-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space Q Meng, S Zheng, H Zhang, W Chen, ZM Ma, TY Liu arXiv preprint arXiv:1802.03713, 2018 | 19 | 2018 |
Geometrical properties and accelerated gradient solvers of non-convex phase retrieval Y Zhou, H Zhang, Y Liang 2016 54th Annual Allerton Conference on Communication, Control, and …, 2016 | 19 | 2016 |
Generalization error bounds with probabilistic guarantee for SGD in nonconvex optimization Y Zhou, Y Liang, H Zhang arXiv preprint arXiv:1802.06903, 2018 | 17 | 2018 |
Capacity control of ReLU neural networks by basis-path norm S Zheng, Q Meng, H Zhang, W Chen, N Yu, TY Liu Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 5925-5932, 2019 | 16 | 2019 |
Multi-key generation over a cellular model with a helper H Zhang, Y Liang, L Lai, SS Shitz IEEE Transactions on Information Theory 63 (6), 3804-3822, 2017 | 16 | 2017 |
Do not let privacy overbill utility: Gradient embedding perturbation for private learning D Yu, H Zhang, W Chen, TY Liu arXiv preprint arXiv:2102.12677, 2021 | 15 | 2021 |
Gradient perturbation is underrated for differentially private convex optimization D Yu, H Zhang, W Chen, TY Liu, J Yin arXiv preprint arXiv:1911.11363, 2019 | 15 | 2019 |
Convergence of distributed stochastic variance reduced methods without sampling extra data S Cen, H Zhang, Y Chi, W Chen, TY Liu IEEE Transactions on Signal Processing 68, 3976-3989, 2020 | 14 | 2020 |
Training over-parameterized deep ResNet is almost as easy as training a two-layer network H Zhang, D Yu, W Chen, TY Liu | 14 | 2019 |
Analysis of robust PCA via local incoherence H Zhang, Y Zhou, Y Liang Advances in Neural Information Processing Systems 28, 2015 | 14 | 2015 |