Suchin Gururangan
Verified email at cs.washington.edu - Homepage
Title / Cited by / Year
The Llama 3 herd of models
A Grattafiori, A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, ...
arXiv preprint arXiv:2407.21783, 2024
Cited by 3641 · 2024
Don't stop pretraining: Adapt language models to domains and tasks
S Gururangan, A Marasović, S Swayamdipta, K Lo, I Beltagy, D Downey, ...
arXiv preprint arXiv:2004.10964, 2020
Cited by 2577 · 2020
Annotation artifacts in natural language inference data
S Gururangan, S Swayamdipta, O Levy, R Schwartz, SR Bowman, ...
arXiv preprint arXiv:1803.02324, 2018
Cited by 1300 · 2018
RealToxicityPrompts: Evaluating neural toxic degeneration in language models
S Gehman, S Gururangan, M Sap, Y Choi, NA Smith
arXiv preprint arXiv:2009.11462, 2020
Cited by 1199 · 2020
Editing models with task arithmetic
G Ilharco, MT Ribeiro, M Wortsman, S Gururangan, L Schmidt, ...
arXiv preprint arXiv:2212.04089, 2022
Cited by 545 · 2022
All that's 'human' is not gold: Evaluating human evaluation of generated text
E Clark, T August, S Serrano, N Haduong, S Gururangan, NA Smith
arXiv preprint arXiv:2107.00061, 2021
Cited by 433 · 2021
Show your work: Improved reporting of experimental results
J Dodge, S Gururangan, D Card, R Schwartz, NA Smith
arXiv preprint arXiv:1909.03004, 2019
Cited by 287 · 2019
LESS: Selecting influential data for targeted instruction tuning
M Xia, S Malladi, S Gururangan, S Arora, D Chen
arXiv preprint arXiv:2402.04333, 2024
Cited by 170 · 2024
Branch-Train-Merge: Embarrassingly parallel training of expert language models
M Li, S Gururangan, T Dettmers, M Lewis, T Althoff, NA Smith, ...
arXiv preprint arXiv:2208.03306, 2022
Cited by 162 · 2022
Variational pretraining for semi-supervised text classification
S Gururangan, T Dang, D Card, NA Smith
arXiv preprint arXiv:1906.02242, 2019
Cited by 145 · 2019
Detoxifying language models risks marginalizing minority voices
A Xu, E Pathak, E Wallace, S Gururangan, M Sap, D Klein
arXiv preprint arXiv:2104.06390, 2021
Cited by 137 · 2021
DEMix layers: Disentangling domains for modular language modeling
S Gururangan, M Lewis, A Holtzman, NA Smith, L Zettlemoyer
arXiv preprint arXiv:2108.05036, 2021
Cited by 123 · 2021
OSWorld: Benchmarking multimodal agents for open-ended tasks in real computer environments
T Xie, D Zhang, J Chen, X Li, S Zhao, R Cao, JH Toh, Z Cheng, D Shin, ...
Advances in Neural Information Processing Systems 37, 52040-52094, 2024
Cited by 93 · 2024
Time waits for no one! Analysis and challenges of temporal misalignment
K Luu, D Khashabi, S Gururangan, K Mandyam, NA Smith
arXiv preprint arXiv:2111.07408, 2021
Cited by 86 · 2021
SILO language models: Isolating legal risk in a nonparametric datastore
S Min, S Gururangan, E Wallace, W Shi, H Hajishirzi, NA Smith, ...
arXiv preprint arXiv:2308.04430, 2023
Cited by 68 · 2023
DataComp-LM: In search of the next generation of training sets for language models
J Li, A Fang, G Smyrnis, M Ivgi, M Jordan, SY Gadre, H Bansal, E Guha, ...
Advances in Neural Information Processing Systems 37, 14200-14282, 2024
Cited by 58 · 2024
Nearest neighbor zero-shot inference
W Shi, J Michael, S Gururangan, L Zettlemoyer
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Cited by 53 · 2022
Scaling expert language models with unsupervised domain discovery
S Gururangan, M Li, M Lewis, W Shi, T Althoff, NA Smith, L Zettlemoyer
arXiv preprint arXiv:2303.14177, 2023
Cited by 43 · 2023
Whose language counts as high quality? Measuring language ideologies in text data selection
S Gururangan, D Card, SK Dreier, EK Gade, LZ Wang, Z Wang, ...
arXiv preprint arXiv:2201.10474, 2022
Cited by 29 · 2022
Language models scale reliably with over-training and on downstream tasks
SY Gadre, G Smyrnis, V Shankar, S Gururangan, M Wortsman, R Shao, ...
arXiv preprint arXiv:2403.08540, 2024
Cited by 28 · 2024
Articles 1–20