Title | Authors | Venue | Cited by | Year |
SGM: sequence generation model for multi-label classification | P Yang, X Sun, W Li, S Ma, W Wu, H Wang | arXiv preprint arXiv:1806.04822, 2018 | 371 | 2018 |
Global encoding for abstractive summarization | J Lin, X Sun, S Ma, Q Su | arXiv preprint arXiv:1805.03989, 2018 | 178 | 2018 |
meProp: Sparsified back propagation for accelerated deep learning with reduced overfitting | X Sun, X Ren, S Ma, H Wang | International Conference on Machine Learning, 3299-3308, 2017 | 166 | 2017 |
Language is not all you need: Aligning perception with language models | S Huang, L Dong, W Wang, Y Hao, S Singhal, S Ma, T Lv, L Cui, ... | arXiv preprint arXiv:2302.14045, 2023 | 158 | 2023 |
DeepNet: Scaling transformers to 1,000 layers | H Wang, S Ma, L Dong, S Huang, D Zhang, F Wei | arXiv preprint arXiv:2203.00555, 2022 | 91 | 2022 |
XLM-E: Cross-lingual language model pre-training via ELECTRA | Z Chi, S Huang, L Dong, S Ma, B Zheng, S Singhal, P Bajaj, X Song, ... | arXiv preprint arXiv:2106.16138, 2021 | 87 | 2021 |
Kosmos-2: Grounding Multimodal Large Language Models to the World | Z Peng, W Wang, L Dong, Y Hao, S Huang, S Ma, F Wei | arXiv preprint arXiv:2306.14824, 2023 | 81 | 2023 |
A simple and effective unified encoder for document-level machine translation | S Ma, D Zhang, M Zhou | Proceedings of the 58th annual meeting of the association for computational …, 2020 | 79 | 2020 |
Improving semantic relevance for sequence-to-sequence learning of Chinese social media text summarization | S Ma, X Sun, J Xu, H Wang, W Li, Q Su | arXiv preprint arXiv:1706.02459, 2017 | 77 | 2017 |
Bag-of-words as target for neural machine translation | S Ma, X Sun, Y Wang, J Lin | arXiv preprint arXiv:1805.04871, 2018 | 72 | 2018 |
Query and output: Generating words by querying distributed word representations for paraphrase generation | S Ma, X Sun, W Li, S Li, W Li, X Ren | arXiv preprint arXiv:1803.01465, 2018 | 69 | 2018 |
Semantic-unit-based dilated convolution for multi-label text classification | J Lin, Q Su, P Yang, S Ma, X Sun | arXiv preprint arXiv:1808.08561, 2018 | 65 | 2018 |
mT6: Multilingual pretrained text-to-text transformer with translation pairs | Z Chi, L Dong, S Ma, S Huang, XL Mao, H Huang, F Wei | arXiv preprint arXiv:2104.08692, 2021 | 59 | 2021 |
A hierarchical end-to-end model for jointly improving text summarization and sentiment classification | S Ma, X Sun, J Lin, X Ren | arXiv preprint arXiv:1805.01089, 2018 | 57 | 2018 |
Alternating language modeling for cross-lingual pre-training | J Yang, S Ma, D Zhang, S Wu, Z Li, M Zhou | Proceedings of the AAAI Conference on Artificial Intelligence 34 (05), 9386-9393, 2020 | 55 | 2020 |
DeltaLM: Encoder-decoder pre-training for language generation and translation by augmenting pretrained multilingual encoders | S Ma, L Dong, S Huang, D Zhang, A Muzio, S Singhal, HH Awadalla, ... | arXiv preprint arXiv:2106.13736, 2021 | 54 | 2021 |
Language models are general-purpose interfaces | Y Hao, H Song, L Dong, S Huang, Z Chi, W Wang, S Ma, F Wei | arXiv preprint arXiv:2206.06336, 2022 | 53 | 2022 |
Autoencoder as assistant supervisor: Improving text representation for Chinese social media text summarization | S Ma, X Sun, J Lin, H Wang | arXiv preprint arXiv:1805.04869, 2018 | 53 | 2018 |
LiveBot: Generating live video comments based on visual and textual contexts | S Ma, L Cui, D Dai, F Wei, X Sun | Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 6810-6817, 2019 | 51 | 2019 |
A deep reinforced sequence-to-set model for multi-label classification | P Yang, F Luo, S Ma, J Lin, X Sun | Proceedings of the 57th Annual Meeting of the Association for Computational …, 2019 | 51 | 2019 |