Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks. M. Kurtz, J. Kopinsky, R. Gelashvili, A. Matveev, J. Carr, M. Goin, W. Leiserson, et al. International Conference on Machine Learning, 5533-5543, 2020. Cited by 171.
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models. E. Kurtic, D. Campos, T. Nguyen, E. Frantar, M. Kurtz, B. Fineran, M. Goin, et al. arXiv preprint arXiv:2203.07259, 2022. Cited by 114.
How Well Do Sparse ImageNet Models Transfer? E. Iofinova, A. Peste, M. Kurtz, D. Alistarh. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022. Cited by 45.
Sparse*BERT: Sparse Models Are Robust. D. Campos, A. Marques, T. Nguyen, M. Kurtz, C. Zhai. arXiv preprint arXiv:2205.12452, 2022. Cited by 7.
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment. A. Agarwalla, A. Gupta, A. Marques, S. Pandit, M. Goin, E. Kurtic, K. Leong, et al. arXiv preprint arXiv:2405.03594, 2024. Cited by 5.
System and Method of Accelerating Execution of a Neural Network. A. Matveev, D. Alistarh, J. Kopinsky, R. Gelashvili, M. Kurtz, N. Shavit. US Patent 11,195,095, 2021. Cited by 4.
oBERTa: Improving Sparse Transfer Learning via Improved Initialization, Distillation, and Pruning Regimes. D. Campos, A. Marques, M. Kurtz, C. X. Zhai. arXiv preprint arXiv:2303.17612, 2023. Cited by 3.
Sparse*BERT: Sparse Models Generalize to New Tasks and Domains. D. Campos, A. Marques, T. Nguyen, M. Kurtz, C. X. Zhai. arXiv preprint arXiv:2205.12452, 2022. Cited by 2.
System and Method of Training a Neural Network. M. Kurtz, D. Alistarh. US Patent App. 17/149,043, 2021. Cited by 1.
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization. E. Kurtic, A. Marques, S. Pandit, M. Kurtz, D. Alistarh. arXiv preprint arXiv:2411.02355, 2024.
System and Method of Accelerating Execution of a Neural Network. A. Matveev, D. Alistarh, J. Kopinsky, R. Gelashvili, M. Kurtz, N. Shavit. US Patent 11,797,855, 2023.