Matteo Papini
Title
Cited by
Year
Stochastic variance-reduced policy gradient
M Papini, D Binaghi, G Canonaco, M Pirotta, M Restelli
Proceedings of the 35th International Conference on Machine Learning 80 …, 2018
Cited by 207, year 2018
Policy optimization via importance sampling
AM Metelli, M Papini, F Faccio, M Restelli
Advances in Neural Information Processing Systems 31, 2018
Cited by 113, year 2018
Feature selection via mutual information: New theoretical insights
M Beraha, AM Metelli, M Papini, A Tirinzoni, M Restelli
2019 International Joint Conference on Neural Networks (IJCNN), 1-9, 2019
Cited by 110, year 2019
Risk-averse trust region optimization for reward-volatility reduction
L Bisi, L Sabbioni, E Vittori, M Papini, M Restelli
arXiv preprint arXiv:1912.03193, 2019
Cited by 71, year 2019
Importance sampling techniques for policy optimization
AM Metelli, M Papini, N Montali, M Restelli
Journal of Machine Learning Research 21 (141), 1-75, 2020
Cited by 60, year 2020
Adaptive batch size for safe policy gradients
M Papini, M Pirotta, M Restelli
Advances in Neural Information Processing Systems 30, 2017
Cited by 50, year 2017
Gradient-aware model-based policy search
P D'Oro, AM Metelli, A Tirinzoni, M Papini, M Restelli
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3801-3808, 2020
Cited by 48, year 2020
Smoothing policies and safe policy gradients
M Papini, M Pirotta, M Restelli
Machine Learning 111 (11), 4081-4137, 2022
Cited by 43, year 2022
Optimistic policy optimization via multiple importance sampling
M Papini, AM Metelli, L Lupo, M Restelli
International Conference on Machine Learning, 4989-4999, 2019
Cited by 41, year 2019
Leveraging good representations in linear contextual bandits
M Papini, A Tirinzoni, M Restelli, A Lazaric, M Pirotta
International Conference on Machine Learning, 8371-8380, 2021
Cited by 30, year 2021
Reinforcement learning in linear MDPs: Constant regret and representation selection
M Papini, A Tirinzoni, A Pacchiano, M Restelli, A Lazaric, M Pirotta
Advances in Neural Information Processing Systems 34, 16371-16383, 2021
Cited by 22, year 2021
Balancing learning speed and stability in policy gradient via adaptive exploration
M Papini, A Battistello, M Restelli
International Conference on Artificial Intelligence and Statistics, 1188-1199, 2020
Cited by 18, year 2020
Lifting the information ratio: An information-theoretic analysis of Thompson sampling for contextual bandits
G Neu, I Olkhovskaia, M Papini, L Schwartz
Advances in Neural Information Processing Systems 35, 9486-9498, 2022
Cited by 16, year 2022
Policy optimization as online learning with mediator feedback
AM Metelli, M Papini, P D'Oro, M Restelli
Proceedings of the AAAI Conference on Artificial Intelligence 35 (10), 8958-8966, 2021
Cited by 14, year 2021
Offline primal-dual reinforcement learning for linear MDPs
G Gabbianelli, G Neu, M Papini, NM Okolo
International Conference on Artificial Intelligence and Statistics, 3169-3177, 2024
Cited by 9, year 2024
Online adversarial MDPs with off-policy feedback and known transitions
F Bacchiocchi, FE Stradi, M Papini, AM Metelli, N Gatti
Sixteenth European Workshop on Reinforcement Learning, 2023
Cited by 9, year 2023
Importance-weighted offline learning done right
G Gabbianelli, G Neu, M Papini
International Conference on Algorithmic Learning Theory, 614-634, 2024
Cited by 5, year 2024
No-Regret Reinforcement Learning in Smooth MDPs
D Maran, AM Metelli, M Papini, M Restelli
arXiv preprint arXiv:2402.03792, 2024
Cited by 4, year 2024
Online learning with off-policy feedback
G Gabbianelli, G Neu, M Papini
International Conference on Algorithmic Learning Theory, 620-641, 2023
Cited by 4, year 2023
Scalable representation learning in linear contextual bandits with constant regret guarantees
A Tirinzoni, M Papini, A Touati, A Lazaric, M Pirotta
Advances in Neural Information Processing Systems 35, 2307-2319, 2022
Cited by 4, year 2022