Yi Wu
Institute for Interdisciplinary Information Sciences, Tsinghua University
Verified email at mail.tsinghua.edu.cn
Title
Cited by
Year
Multi-agent actor-critic for mixed cooperative-competitive environments
R Lowe, Y Wu, A Tamar, J Harb, P Abbeel, I Mordatch
Advances in neural information processing systems 30, 2017
Cited by 5585, 2017
The surprising effectiveness of PPO in cooperative multi-agent games
C Yu, A Velu, E Vinitsky, J Gao, Y Wang, A Bayen, Y Wu
Advances in Neural Information Processing Systems 35, 24611-24624, 2022
Cited by 1421, 2022
Emergent tool use from multi-agent autocurricula
B Baker, I Kanitscheider, T Markov, Y Wu, G Powell, B McGrew, ...
arXiv preprint arXiv:1909.07528, 2019
Cited by 919, 2019
Value iteration networks
A Tamar, Y Wu, G Thomas, S Levine, P Abbeel
Advances in neural information processing systems 29, 2016
Cited by 785, 2016
Building generalizable agents with a realistic and rich 3D environment
Y Wu, Y Wu, G Gkioxari, Y Tian
arXiv preprint arXiv:1801.02209, 2018
Cited by 399, 2018
Robust multi-agent reinforcement learning via minimax deep deterministic policy gradient
S Li, Y Wu, X Cui, H Dong, F Fang, S Russell
Proceedings of the AAAI conference on artificial intelligence 33 (01), 4213-4220, 2019
Cited by 374, 2019
Adversarial training for relation extraction
Y Wu, D Bamman, S Russell
Proceedings of the 2017 Conference on Empirical Methods in Natural Language …, 2017
Cited by 257, 2017
Multi-task reinforcement learning with soft modularization
R Yang, H Xu, Y Wu, X Wang
Advances in Neural Information Processing Systems 33, 4767-4777, 2020
Cited by 204, 2020
Influence-based multi-agent exploration
T Wang, J Wang, Y Wu, C Zhang
arXiv preprint arXiv:1910.05512, 2019
Cited by 167, 2019
Bayesian relational memory for semantic visual navigation
Y Wu, Y Wu, A Tamar, S Russell, G Gkioxari, Y Tian
Proceedings of the IEEE/CVF international conference on computer vision …, 2019
Cited by 136*, 2019
Evolutionary population curriculum for scaling multi-agent reinforcement learning
Q Long, Z Zhou, A Gupta, F Fang, Y Wu, X Wang
arXiv preprint arXiv:2003.10423, 2020
Cited by 126, 2020
NovelD: A simple yet effective exploration criterion
T Zhang, H Xu, X Wang, Y Wu, K Keutzer, JE Gonzalez, Y Tian
Advances in Neural Information Processing Systems 34, 25217-25230, 2021
Cited by 123*, 2021
Sequence level contrastive learning for text summarization
S Xu, X Zhang, Y Wu, F Wei
Proceedings of the AAAI conference on artificial intelligence 36 (10), 11556 …, 2022
Cited by 107, 2022
Deep reinforcement learning for green security games with real-time information
Y Wang, ZR Shi, L Yu, Y Wu, R Singh, L Joppa, F Fang
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 1401-1408, 2019
Cited by 98, 2019
BitNet: Scaling 1-bit transformers for large language models
H Wang, S Ma, L Dong, S Huang, H Wang, L Ma, F Yang, R Wang, Y Wu, ...
arXiv preprint arXiv:2310.11453, 2023
Cited by 85, 2023
Unsupervised extractive summarization by pre-training hierarchical transformers
S Xu, X Zhang, Y Wu, F Wei, M Zhou
arXiv preprint arXiv:2010.08242, 2020
Cited by 63, 2020
Maximum entropy population-based training for zero-shot human-AI coordination
R Zhao, J Song, Y Yuan, H Hu, Y Gao, Y Wu, Z Sun, W Yang
Proceedings of the AAAI Conference on Artificial Intelligence 37 (5), 6145-6153, 2023
Cited by 62, 2023
Discovering diverse multi-agent strategic behavior via reward randomization
Z Tang, C Yu, B Chen, H Xu, X Wang, F Fang, S Du, Y Wang, Y Wu
arXiv preprint arXiv:2103.04564, 2021
Cited by 58, 2021
Language agents with reinforcement learning for strategic play in the Werewolf game
Z Xu, C Yu, F Fang, Y Wang, Y Wu
arXiv preprint arXiv:2310.18940, 2023
Cited by 52, 2023
Is DPO superior to PPO for LLM alignment? A comprehensive study
S Xu, W Fu, J Gao, W Ye, W Liu, Z Mei, G Wang, C Yu, Y Wu
arXiv preprint arXiv:2404.10719, 2024
Cited by 49, 2024