Xudong Han
LibrAI & MBZUAI
Verified email at mbzuai.ac.ae
Title
Cited by
Year
Do-not-answer: Evaluating safeguards in LLMs
Y Wang, H Li, X Han, P Nakov, T Baldwin
Findings of the Association for Computational Linguistics: EACL 2024, 896-911, 2024
Cited by: 62* | Year: 2024
Diverse adversaries for mitigating bias in training
X Han, T Baldwin, T Cohn
arXiv preprint arXiv:2101.10001, 2021
Cited by: 58 | Year: 2021
Jais and jais-chat: Arabic-centric foundation and instruction-tuned open generative large language models
N Sengupta, SK Sahu, B Jia, S Katipomu, H Li, F Koto, W Marshall, ...
arXiv preprint arXiv:2308.16149, 2023
Cited by: 55 | Year: 2023
Evaluating debiasing techniques for intersectional biases
S Subramanian, X Han, T Baldwin, T Cohn, L Frermann
arXiv preprint arXiv:2109.10441, 2021
Cited by: 47 | Year: 2021
Balancing out bias: Achieving fairness through balanced training
X Han, T Baldwin, T Cohn
arXiv preprint arXiv:2109.08253, 2021
Cited by: 47* | Year: 2021
Contrastive learning for fair representations
A Shen, X Han, T Cohn, T Baldwin, L Frermann
arXiv preprint arXiv:2109.10645, 2021
Cited by: 24 | Year: 2021
Decoupling Adversarial Training for Fair NLP
X Han, T Baldwin, T Cohn
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021
Cited by: 22 | Year: 2021
Fairlib: A unified framework for assessing and improving fairness
X Han, A Shen, Y Li, L Frermann, T Baldwin, T Cohn
Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022
Cited by: 19* | Year: 2022
Optimising equal opportunity fairness in model training
A Shen, X Han, T Cohn, T Baldwin, L Frermann
arXiv preprint arXiv:2205.02393, 2022
Cited by: 18 | Year: 2022
Does Representational Fairness Imply Empirical Fairness?
A Shen, X Han, T Cohn, T Baldwin, L Frermann
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022 …, 2022
Cited by: 15 | Year: 2022
Towards equal opportunity fairness through adversarial learning
X Han, T Baldwin, T Cohn
arXiv preprint arXiv:2203.06317, 2022
Cited by: 13 | Year: 2022
Systematic evaluation of predictive fairness
X Han, A Shen, T Cohn, T Baldwin, L Frermann
arXiv preprint arXiv:2210.08758, 2022
Cited by: 10 | Year: 2022
Fair enough: Standardizing evaluation and model selection for fairness research in NLP
X Han, T Baldwin, T Cohn
arXiv preprint arXiv:2302.05711, 2023
Cited by: 7 | Year: 2023
Grounding learning of modifier dynamics: An application to color naming
X Han, P Schulz, T Cohn
arXiv preprint arXiv:1909.07586, 2019
Cited by: 6 | Year: 2019
Against The Achilles' Heel: A Survey on Red Teaming for Generative Models
L Lin, H Mu, Z Zhai, M Wang, Y Wang, R Wang, J Gao, Y Zhang, W Che, ...
arXiv preprint arXiv:2404.00629, 2024
Cited by: 5 | Year: 2024
Commodity recommendation for users based on E-commerce data
F Yang, X Han, J Lang, W Lu, L Liu, L Zhang, J Pan
Proceedings of the 2nd International Conference on Big Data Research, 146-149, 2018
Cited by: 5 | Year: 2018
Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents
R Wang, H Li, X Han, Y Zhang, T Baldwin
arXiv preprint arXiv:2402.11651, 2024
Cited by: 4 | Year: 2024
Everybody needs good neighbours: An unsupervised locality-based method for bias mitigation
X Han, T Baldwin, T Cohn
The Eleventh International Conference on Learning Representations, 2022
Cited by: 3 | Year: 2022
A Chinese Dataset for Evaluating the Safeguards in Large Language Models
Y Wang, Z Zhai, H Li, X Han, L Lin, Z Zhang, J Zhao, P Nakov, T Baldwin
arXiv preprint arXiv:2402.12193, 2024
Cited by: 2 | Year: 2024
Uncertainty Estimation for Debiased Models: Does Fairness Hurt Reliability?
G Kuzmin, A Vazhentsev, A Shelmanov, X Han, S Suster, M Panov, ...
Proceedings of the 13th International Joint Conference on Natural Language …, 2023
Cited by: 2 | Year: 2023