Linjie (Lindsey) Li

Cited by

	All	Since 2019
Citations	7381	7364
h-index	32	32
i10-index	40	39

Citations per year

Year	Citations
2019	24
2020	229
2021	811
2022	1537
2023	3294
2024	1458

Public access

6 articles available, 0 articles not available (based on funding mandates)

Co-authors

Zhe Gan	Research Scientist, Apple	Verified email at apple.com
Lijuan Wang	Microsoft GenAI	Verified email at microsoft.com
Zicheng Liu	Microsoft	Verified email at microsoft.com
Kevin Lin	Microsoft	Verified email at microsoft.com
Zhengyuan Yang	Researcher, Microsoft	Verified email at microsoft.com
Jianfeng Wang	Microsoft	Verified email at microsoft.com
Yu Cheng	Visiting Professor at Rice University	Verified email at rice.edu
Yen-Chun Chen	Researcher, Microsoft	Verified email at microsoft.com
Licheng Yu 虞立成	Research Scientist and Manager, Facebook AI	Verified email at fb.com
Jianfeng Gao	Microsoft Research, Redmond	Verified email at microsoft.com
Jie Lei 雷杰	Research Scientist, Meta AI	Verified email at fb.com
Jianwei Yang	Principal Researcher, Microsoft Research, Redmond	Verified email at microsoft.com
Chunyuan Li	Microsoft Research, Redmond	Verified email at microsoft.com
Siqi Sun	Associate Professor; Fudan University, Shanghai AI Lab	Verified email at fudan.edu.cn

Linjie (Lindsey) Li

Senior Researcher, Microsoft

Verified email at microsoft.com

Vision+Language


Title	Authors	Venue	Cited by	Year
UNITER: Learning UNiversal Image-TExt Representations	YC Chen, L Li, L Yu, AE Kholy, F Ahmed, Z Gan, Y Cheng, J Liu	ECCV 2020, 2020	2232*	2020
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling	J Lei, L Li, L Zhou, Z Gan, TL Berg, M Bansal, J Liu	CVPR 2021, 2021	581	2021
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training	L Li, YC Chen, Y Cheng, Z Gan, L Yu, J Liu	EMNLP 2020, 2020	459	2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning	Z Gan, YC Chen, L Li, C Zhu, Y Cheng, J Liu	NeurIPS 2020, 2020	458	2020
Relation-aware graph attention network for visual question answering	L Li, Z Gan, Y Cheng, J Liu	ICCV 2019, 2019	383	2019
GIT: A Generative Image-to-text Transformer for Vision and Language	J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang	TMLR, 2022	343	2022
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)	Z Yang, L Li, K Lin, J Wang, CC Lin, Z Liu, L Wang	arXiv preprint arXiv:2309.17421 9, 1, 2023	222	2023
Segment Everything Everywhere All at Once	X Zou, J Yang, H Zhang, F Li, L Li, J Gao, YJ Lee	NeurIPS 2023, 2023	222	2023
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action	Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ...	arXiv preprint arXiv:2303.11381, 2023	193	2023
VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling	TJ Fu, L Li, Z Gan, K Lin, WY Wang, L Wang, Z Liu	arXiv preprint arXiv:2111.12681, 2021	178	2021
SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning	K Lin, L Li, CC Lin, F Ahmed, Z Gan, Z Liu, Y Lu, L Wang	CVPR 2022, 2021	175	2021
Improving image generation with better captions	J Betker, G Goh, L Jing, T Brooks, J Wang, L Li, L Ouyang, J Zhuang, ...	Computer Science. https://cdn.openai.com/papers/dall-e-3.pdf 2 (3), 8, 2023	163	2023
Graph Optimal Transport for Cross-Domain Alignment	L Chen, Z Gan, Y Cheng, L Li, L Carin, J Liu	ICML 2020, 2020	151	2020
Generalized Decoding for Pixel, Image, and Language	X Zou, ZY Dou, J Yang, Z Gan, L Li, C Li, X Dai, H Behl, J Wang, L Yuan, ...	CVPR 2023, 2022	142	2022
Mitigating hallucination in large multi-modal models via robust instruction tuning	F Liu, K Lin, L Li, J Wang, Y Yacoob, L Wang	ICLR 2024, 2023	128*	2023
Vision-Language Pre-training: Basics, Recent Advances, and Future Trends	Z Gan, L Li, C Li, L Wang, Z Liu, J Gao	Foundations and Trends® in Computer Graphics and Vision 14 (3–4), 163-352, 2022	122	2022
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities	W Yu, Z Yang, L Li, J Wang, K Lin, Z Liu, X Wang, L Wang	arXiv preprint arXiv:2308.02490, 2023	117	2023
Multi-step reasoning via recurrent dual attention for visual dialog	Z Gan, Y Cheng, AEI Kholy, L Li, J Liu, J Gao	ACL 2019, 2019	110	2019
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation	L Li, J Lei, Z Gan, L Yu, YC Chen, R Pillai, Y Cheng, L Zhou, XE Wang, ...	NeurIPS 2021 Data and Benchmark Track, 2021	95	2021
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone	ZY Dou, A Kamath, Z Gan, P Zhang, J Wang, L Li, Z Liu, C Liu, Y LeCun, ...	NeurIPS 2022, 2022	86	2022
