Peng Gao

Cited by

	All	Since 2019
Citations	6759	6754
h-index	36	36
i10-index	67	67

Citations per year
Year	2019	2020	2021	2022	2023	2024
Citations	37	138	292	712	2397	3166

Public access

34 articles available

2 articles not available

Based on funding mandates

Co-authors

Hongsheng Li (李鸿升), The Chinese University of Hong Kong. Verified email at ee.cuhk.edu.hk
Yu Qiao, Professor of Shanghai AI Laboratory; Shenzhen Institutes of Advanced Technology, CAS. Verified email at siat.ac.cn
Renrui Zhang, MMLab CUHK & Peking University. Verified email at pku.edu.cn
Shijie Geng, Research Scientist, ByteDance Inc. Verified email at bytedance.com
Jifeng Dai, Associate Professor of EE, Tsinghua University; Adjunct Researcher of Shanghai AI Laboratory. Verified email at tsinghua.edu.cn
Ziyi Lin, The Chinese University of Hong Kong. Verified email at link.cuhk.edu.hk
Jiaming Han, PhD Student, CUHK MMLab. Verified email at link.cuhk.edu.hk
Wenqi Shao, Researcher at Shanghai AI Laboratory. Verified email at pjlab.org.cn
Xiaogang Wang, Professor of Electronic Engineering, the Chinese University of Hong Kong. Verified email at ee.cuhk.edu.hk
Steven C.H. Hoi, Managing Director of Salesforce Research Asia; IEEE Fellow; Professor at SMU. Verified email at smu.edu.sg
Jiasen Lu, Research Scientist, Apple. Verified email at apple.com

Peng Gao

Shanghai AI Lab

Verified email at pjlab.org.cn

Image/Video Generation, LLMs, VLMs


Title	Cited by	Year
CLIP-Adapter: Better vision-language models with feature adapters. P Gao, S Geng, R Zhang, T Ma, R Fang, Y Zhang, H Li, Y Qiao. International Journal of Computer Vision, 2021	624	2021
LLaMA-Adapter: Efficient fine-tuning of language models with zero-init attention. R Zhang, J Han, C Liu, P Gao, A Zhou, X Hu, S Yan, P Lu, H Li, Y Qiao. arXiv preprint arXiv:2303.16199, 2023	497	2023
UniFormer: Unified transformer for efficient spatiotemporal representation learning. K Li, Y Wang, P Gao, G Song, Y Liu, H Li, Y Qiao. arXiv preprint arXiv:2201.04676, 2022	493*	2022
Tip-Adapter: Training-free CLIP-Adapter for better vision-language modeling. R Zhang, R Fang, W Zhang, P Gao, K Li, J Dai, Y Qiao, H Li. arXiv preprint arXiv:2111.03930, 2021	483*	2021
Dynamic fusion with intra- and inter-modality attention flow for visual question answering. P Gao, Z Jiang, H You, P Lu, SCH Hoi, X Wang, H Li. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019	418*	2019
LLaMA-Adapter V2: Parameter-efficient visual instruction model. P Gao, J Han, R Zhang, Z Lin, S Geng, A Zhou, W Zhang, P Lu, C He, ... arXiv preprint arXiv:2304.15010, 2023	366	2023
PointCLIP: Point cloud understanding by CLIP. R Zhang, Z Guo, W Zhang, K Li, X Miao, B Cui, Y Qiao, P Gao, H Li. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	329	2022
Fast convergence of DETR with spatially modulated co-attention. P Gao, M Zheng, X Wang, J Dai, H Li. Proceedings of the IEEE/CVF international conference on computer vision …, 2021	298	2021
End-to-end object detection with adaptive clustering transformer. M Zheng, P Gao, R Zhang, K Li, X Wang, H Li, H Dong. arXiv preprint arXiv:2011.09315, 2020	214	2020
Point-M2AE: Multi-scale masked autoencoders for hierarchical point cloud pre-training. R Zhang, Z Guo, P Gao, R Fang, B Zhao, D Wang, Y Qiao, H Li. Advances in neural information processing systems 35, 27061-27074, 2022	182	2022
Frozen CLIP models are efficient video learners. Z Lin, S Geng, R Zhang, P Gao, G De Melo, X Wang, J Dai, Y Qiao, H Li. European Conference on Computer Vision, 388-404, 2022	149	2022
ConvMAE: Masked convolution meets masked autoencoders. P Gao, T Ma, H Li, Z Lin, J Dai, Y Qiao. arXiv preprint arXiv:2205.03892, 2022	145*	2022
PointCLIP V2: Prompting CLIP and GPT for powerful 3D open-world learning. X Zhu, R Zhang, B He, Z Guo, Z Zeng, Z Qin, S Zhang, P Gao. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	135*	2023
Prompt, generate, then cache: Cascade of foundation models makes strong few-shot learners. R Zhang, X Hu, B Li, S Huang, H Deng, Y Qiao, P Gao, H Li. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	113	2023
MonoDETR: Depth-guided transformer for monocular 3D object detection. R Zhang, H Qiu, T Wang, Z Guo, Z Cui, Y Qiao, H Li, P Gao. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	108	2023
LVLM-eHub: A comprehensive evaluation benchmark for large vision-language models. P Xu, W Shao, K Zhang, P Gao, S Liu, M Lei, F Meng, S Huang, Y Qiao, ... arXiv preprint arXiv:2306.09265, 2023	107	2023
Personalize segment anything model with one shot. R Zhang, Z Jiang, Z Guo, S Yan, J Pan, X Ma, H Dong, P Gao, H Li. arXiv preprint arXiv:2305.03048, 2023	107	2023
SPHINX: The joint mixing of weights, tasks, and visual embeddings for multi-modal large language models. Z Lin, C Liu, R Zhang, P Gao, L Qiu, H Xiao, H Qiu, C Lin, W Shao, ... arXiv preprint arXiv:2311.07575, 2023	103	2023
Learning 3D representations from 2D pre-trained models via image-to-point masked autoencoders. R Zhang, L Wang, Y Qiao, P Gao, H Li. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	97	2023
Multi-modality latent interaction network for visual question answering. P Gao, H You, Z Zhang, X Wang, H Li. Proceedings of the IEEE/CVF international conference on computer vision …, 2019	94*	2019

Articles 1–20
