Punit Singh Koura
Meta AI
Verified email at fb.com
Title · Cited by · Year
Llama 2: Open foundation and fine-tuned chat models
H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ...
arXiv preprint arXiv:2307.09288, 2023
12424 · 2023
OPT: Open pre-trained transformer language models
S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ...
arXiv preprint arXiv:2205.01068, 2022
3677* · 2022
The Llama 3 herd of models
A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...
arXiv preprint arXiv:2407.21783, 2024
2281 · 2024
Efficient large scale language modeling with mixtures of experts
M Artetxe, S Bhosale, N Goyal, T Mihaylov, M Ott, S Shleifer, XV Lin, J Du, ...
arXiv preprint arXiv:2112.10684, 2021
149* · 2021
Few-shot Learning with Multilingual Generative Language Models
XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ...
Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022
131* · 2022
OPT-IML: Scaling language model instruction meta learning through the lens of generalization
S Iyer, XV Lin, R Pasunuru, T Mihaylov, D Simig, P Yu, K Shuster, T Wang, ...
arXiv preprint arXiv:2212.12017, 2022
102 · 2022
A theory on Adam instability in large-scale machine learning
I Molybog, P Albert, M Chen, Z DeVito, D Esiobu, N Goyal, PS Koura, ...
arXiv preprint arXiv:2304.09871, 2023
26 · 2023
Llama 2: Open foundation and fine-tuned chat models. arXiv [Preprint] (2023)
H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ...
URL https://arxiv.org/abs/2307.09288
14
BTS: Harmonizing Specialized Experts into a Generalist LLM
Q Zhang, P Bhargava, C Bi, CX Cai, J Foerster, J Fu, PS Koura, R Silva, ...
arXiv preprint arXiv:2502.00075, 2025
2025
Optimizing Pretraining Data Mixtures with LLM-Estimated Utility
W Held, B Paranjape, PS Koura, M Lewis, F Zhang, T Mihaylov
arXiv preprint arXiv:2501.11747, 2025
2025
Articles 1–10