Mamba: Linear-time sequence modeling with selective state spaces. A Gu, T Dao. arXiv preprint arXiv:2312.00752, 2023. Cited by 1871.
Efficiently modeling long sequences with structured state spaces. A Gu, K Goel, C Ré. arXiv preprint arXiv:2111.00396, 2021. Cited by 1558.
Combining recurrent, convolutional, and continuous-time models with linear state space layers. A Gu, I Johnson, K Goel, K Saab, T Dao, A Rudra, C Ré. Advances in Neural Information Processing Systems 34, 572-585, 2021. Cited by 509.
Representation tradeoffs for hyperbolic embeddings. F Sala, C De Sa, A Gu, C Ré. International Conference on Machine Learning, 4460-4469, 2018. Cited by 501.
HiPPO: Recurrent memory with optimal polynomial projections. A Gu, T Dao, S Ermon, A Rudra, C Ré. Advances in Neural Information Processing Systems 33, 1474-1487, 2020. Cited by 465.
On the parameterization and initialization of diagonal state space models. A Gu, K Goel, A Gupta, C Ré. Advances in Neural Information Processing Systems 35, 35971-35983, 2022. Cited by 307.
Diagonal state spaces are as effective as structured state spaces. A Gupta, A Gu, J Berant. Advances in Neural Information Processing Systems 35, 22982-22994, 2022. Cited by 278.
Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality. T Dao, A Gu. arXiv preprint arXiv:2405.21060, 2024. Cited by 256.
No subclass left behind: Fine-grained robustness in coarse-grained classification problems. N Sohoni, J Dunnmon, G Angus, A Gu, C Ré. Advances in Neural Information Processing Systems 33, 19339-19352, 2020. Cited by 250.
Learning mixed-curvature representations in product spaces. A Gu, F Sala, B Gunel, C Ré. International Conference on Learning Representations, 2018. Cited by 249.
Resurrecting recurrent neural networks for long sequences. A Orvieto, SL Smith, A Gu, A Fernando, C Gulcehre, R Pascanu, S De. International Conference on Machine Learning, 26670-26698, 2023. Cited by 246.
A kernel theory of modern data augmentation. T Dao, A Gu, A Ratner, V Smith, C De Sa, C Ré. International Conference on Machine Learning, 1528-1537, 2019. Cited by 232.
It’s Raw! Audio generation with state-space models. K Goel, A Gu, C Donahue, C Ré. International Conference on Machine Learning, 7616-7633, 2022. Cited by 209.
S4ND: Modeling images and videos as multidimensional signals with state spaces. E Nguyen, K Goel, A Gu, G Downs, P Shah, T Dao, S Baccus, C Ré. Advances in Neural Information Processing Systems 35, 2846-2861, 2022. Cited by 187.
Learning fast algorithms for linear transforms using butterfly factorizations. T Dao, A Gu, M Eichhorn, A Rudra, C Ré. International Conference on Machine Learning, 1517-1527, 2019. Cited by 126.
From trees to continuous embeddings and back: Hyperbolic hierarchical clustering. I Chami, A Gu, V Chatziafratis, C Ré. Advances in Neural Information Processing Systems 33, 15065-15076, 2020. Cited by 109.
How to train your HiPPO: State space models with generalized orthogonal basis projections. A Gu, I Johnson, A Timalsina, A Rudra, C Ré. arXiv preprint arXiv:2206.12037, 2022. Cited by 95.
Improving the gating mechanism of recurrent neural networks. A Gu, C Gulcehre, T Paine, M Hoffman, R Pascanu. International Conference on Machine Learning, 3800-3809, 2020. Cited by 93.
Model patching: Closing the subgroup performance gap with data augmentation. K Goel, A Gu, Y Li, C Ré. arXiv preprint arXiv:2008.06775, 2020. Cited by 88.
Griffin: Mixing gated linear recurrences with local attention for efficient language models. S De, SL Smith, A Fernando, A Botev, G Cristian-Muraru, A Gu, R Haroun, ... arXiv preprint arXiv:2402.19427, 2024. Cited by 83.