Mohammad Shoeybi
Director of Applied Research at NVIDIA
Verified email at nvidia.com
Title · Cited by · Year
Megatron-LM: Training multi-billion parameter language models using model parallelism
M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, B Catanzaro
arXiv preprint arXiv:1909.08053, 2019
Cited by 1797 · 2019
BLOOM: A 176B-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
Cited by 1622 · 2023
Deep Voice: Real-time neural text-to-speech
SÖ Arık, M Chrzanowski, A Coates, G Diamos, A Gibiansky, Y Kang, X Li, ...
International conference on machine learning, 195-204, 2017
Cited by 845 · 2017
Efficient large-scale language model training on GPU clusters using Megatron-LM
D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by 656 · 2021
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990, 2022
Cited by 646 · 2022
Reducing activation recomputation in large transformer models
VA Korthikanti, J Casper, S Lym, L McAfee, M Andersch, M Shoeybi, ...
Proceedings of Machine Learning and Systems 5, 341-353, 2023
Cited by 200 · 2023
VILA: On pre-training for visual language models
J Lin, H Yin, W Ping, P Molchanov, M Shoeybi, S Han
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 180 · 2024
Training question answering models from synthetic data
R Puri, R Spring, M Patwary, M Shoeybi, B Catanzaro
arXiv preprint arXiv:2002.09599, 2020
Cited by 169 · 2020
Factuality enhanced language models for open-ended text generation
N Lee, W Ping, P Xu, M Patwary, PN Fung, M Shoeybi, B Catanzaro
Advances in Neural Information Processing Systems 35, 34586-34599, 2022
Cited by 164 · 2022
On the use of the Ffowcs Williams-Hawkings equation to predict far-field jet noise from large-eddy simulations
S Mendez, M Shoeybi, SK Lele, P Moin
International Journal of Aeroacoustics 12 (1-2), 1-20, 2013
Cited by 163 · 2013
MEGATRON-CNTRL: Controllable story generation with external knowledge using large-scale language models
P Xu, M Patwary, M Shoeybi, R Puri, P Fung, A Anandkumar, B Catanzaro
arXiv preprint arXiv:2010.00840, 2020
Cited by 157 · 2020
BioMegatron: larger biomedical domain language model
HC Shin, Y Zhang, E Bakhturina, R Puri, M Patwary, M Shoeybi, R Mani
Proceedings of the 2020 Conference on Empirical Methods in Natural Language …, 2020
Cited by 144 · 2020
Long-short transformer: Efficient transformers for language and vision
C Zhu, W Ping, C Xiao, M Shoeybi, T Goldstein, A Anandkumar, ...
Advances in neural information processing systems 34, 17723-17736, 2021
Cited by 137 · 2021
FP8 formats for deep learning
P Micikevicius, D Stosic, N Burgess, M Cornea, P Dubey, R Grisenthwaite, ...
arXiv preprint arXiv:2209.05433, 2022
Cited by 124 · 2022
Retrieval meets long context large language models
P Xu, W Ping, X Wu, L McAfee, C Zhu, Z Liu, S Subramanian, ...
arXiv preprint arXiv:2310.03025, 2023
Cited by 110 · 2023
Stable and accurate schemes for the compressible Navier–Stokes equations
K Mattsson, M Svärd, M Shoeybi
Journal of Computational Physics 227 (4), 2293-2316, 2008
Cited by 100 · 2008
End-to-end training of neural retrievers for open-domain question answering
DS Sachan, M Patwary, M Shoeybi, N Kant, W Ping, WL Hamilton, ...
arXiv preprint arXiv:2101.00408, 2021
Cited by 98 · 2021
Unsupervised video interpolation using cycle consistency
FA Reda, D Sun, A Dundar, M Shoeybi, G Liu, KJ Shih, A Tao, J Kautz, ...
Proceedings of the IEEE/CVF international conference on computer Vision, 892-900, 2019
Cited by 98 · 2019
Exploring the limits of domain-adaptive training for detoxifying large-scale language models
B Wang, W Ping, C Xiao, P Xu, M Patwary, M Shoeybi, B Li, ...
Advances in Neural Information Processing Systems 35, 35811-35824, 2022
Cited by 61 · 2022
Shall we pretrain autoregressive language models with retrieval? a comprehensive study
B Wang, W Ping, P Xu, L McAfee, Z Liu, M Shoeybi, Y Dong, O Kuchaiev, ...
arXiv preprint arXiv:2304.06762, 2023
Cited by 57 · 2023
Articles 1–20