Fast and scalable Bayesian deep learning by weight-perturbation in Adam. ME Khan, D Nielsen, V Tangkaratt, W Lin, Y Gal, A Srivastava. International Conference on Machine Learning, 2611-2620, 2018. Cited by 317.
Conjugate-computation variational inference: Converting variational inference in non-conjugate models to inferences in conjugate models. M Khan, W Lin. Artificial Intelligence and Statistics, 878-887, 2017. Cited by 159.
Fast and simple natural-gradient variational inference with mixture of exponential-family approximations. W Lin, ME Khan, M Schmidt. International Conference on Machine Learning, 3992-4002, 2019. Cited by 64.
Variational message passing with structured inference networks. W Lin, N Hubacher, ME Khan. arXiv preprint arXiv:1803.05589, 2018. Cited by 49.
Faster stochastic variational inference using proximal-gradient methods with general divergence functions. ME Khan, R Babanezhad, W Lin, M Schmidt, M Sugiyama. arXiv preprint arXiv:1511.00146, 2015. Cited by 48.
Tractable structured natural-gradient descent using local parameterizations. W Lin, F Nielsen, ME Khan, M Schmidt. International Conference on Machine Learning, 6680-6691, 2021. Cited by 35.
Handling the positive-definite constraint in the Bayesian learning rule. W Lin, M Schmidt, ME Khan. International Conference on Machine Learning, 6116-6126, 2020. Cited by 32.
Stein's lemma for the reparameterization trick with exponential family mixtures. W Lin, ME Khan, M Schmidt. arXiv preprint arXiv:1910.13398, 2019. Cited by 25.
Variational adaptive-Newton method for explorative learning. ME Khan, W Lin, V Tangkaratt, Z Liu, D Nielsen. arXiv preprint arXiv:1711.05560, 2017. Cited by 22.
Structured second-order methods via natural gradient descent. W Lin, F Nielsen, ME Khan, M Schmidt. arXiv preprint arXiv:2107.10884, 2021. Cited by 9.
Convergence of proximal-gradient stochastic variational inference under non-decreasing step-size sequence. ME Khan, R Babanezhad, W Lin, M Schmidt, M Sugiyama. J. Comp. Neurol 319, 359-386, 2015. Cited by 9.
Training Data Attribution via Approximate Unrolled Differentiation. J Bae, W Lin, J Lorraine, R Grosse. arXiv preprint arXiv:2405.12186, 2024. Cited by 8.
Structured inverse-free natural gradient: Memory-efficient & numerically-stable KFAC for large neural nets. W Lin, F Dangel, R Eschenhagen, K Neklyudov, A Kristiadi, RE Turner, ... arXiv preprint arXiv:2312.05705, 2023. Cited by 7*.
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning. W Lin, V Duruisseaux, M Leok, F Nielsen, ME Khan, M Schmidt. arXiv preprint arXiv:2302.09738, 2023. Cited by 7*.
WaterlooClarke: TREC 2015 Total Recall Track. H Zhang, W Lin, Y Wang, CLA Clarke, MD Smucker. TREC, 2015. Cited by 7.
Can we remove the square-root in adaptive gradient methods? A second-order perspective. W Lin, F Dangel, R Eschenhagen, J Bae, RE Turner, A Makhzani. arXiv preprint arXiv:2402.03496, 2024. Cited by 3.
Natural-gradient stochastic variational inference for non-conjugate structured variational autoencoder. W Lin, ME Khan, N Hubacher, D Nielsen. International Conference on Machine Learning, 2017. Cited by 2.
Computationally efficient geometric methods for optimization and inference in machine learning. W Lin. University of British Columbia, 2023. Cited by 1.
Introduction to Natural-gradient Descent: Part I-VI. W Lin, F Nielsen, ME Khan, M Schmidt. https://yorkerlin.github.io/posts/2021/09/Geomopt01/, 2021.
Variational Inference on Deep Exponential Family by using Variational Inferences on Conjugate Models. ME Khan, W Lin.