Ronald Ortner
Verified email at unileoben.ac.at
Title
Cited by
Year
Near-optimal Regret Bounds for Reinforcement Learning.
T Jaksch, R Ortner, P Auer
Journal of Machine Learning Research 11 (4), 2010
Cited by: 702, Year: 2010
UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem
P Auer, R Ortner
Periodica Mathematica Hungarica 61 (1-2), 55-65, 2010
Cited by: 258, Year: 2010
Improved rates for the stochastic continuum-armed bandit problem
P Auer, R Ortner, C Szepesvári
International Conference on Computational Learning Theory, 454-468, 2007
Cited by: 205, Year: 2007
Logarithmic online regret bounds for undiscounted reinforcement learning
P Auer, R Ortner
Advances in Neural Information Processing Systems 19, 49, 2007
Cited by: 197, Year: 2007
Near-optimal Regret Bounds for Reinforcement Learning.
P Auer, T Jaksch, R Ortner
NIPS, 89-96, 2008
Cited by: 141, Year: 2008
A boosting approach to multiple instance learning
P Auer, R Ortner
European Conference on Machine Learning, 63-74, 2004
Cited by: 100, Year: 2004
Online regret bounds for undiscounted continuous reinforcement learning
R Ortner, D Ryabko
arXiv preprint arXiv:1302.2550, 2013
Cited by: 69, Year: 2013
Efficient bias-span-constrained exploration-exploitation in reinforcement learning
R Fruit, M Pirotta, A Lazaric, R Ortner
International Conference on Machine Learning, 1578-1586, 2018
Cited by: 53, Year: 2018
Regret bounds for restless Markov bandits
R Ortner, D Ryabko, P Auer, R Munos
International Conference on Algorithmic Learning Theory, 214-228, 2012
Cited by: 50, Year: 2012
PAC-Bayesian analysis of contextual bandits
Y Seldin, P Auer, JS Shawe-Taylor, R Ortner, F Laviolette
Advances in Neural Information Processing Systems, 1683-1691, 2011
Cited by: 47, Year: 2011
Non-backtracking random walks and cogrowth of graphs
R Ortner, W Woess
Canadian Journal of Mathematics 59 (4), 828-844, 2007
Cited by: 42, Year: 2007
Adaptively tracking the best bandit arm with an unknown number of distribution changes
P Auer, P Gajane, R Ortner
Conference on Learning Theory, 138-158, 2019
Cited by: 41*, Year: 2019
Regret bounds for restless Markov bandits
R Ortner, D Ryabko, P Auer, R Munos
Theoretical Computer Science 558, 62-76, 2014
Cited by: 35, Year: 2014
Pseudometrics for state aggregation in average reward Markov decision processes
R Ortner
International Conference on Algorithmic Learning Theory, 373-387, 2007
Cited by: 31, Year: 2007
Improved learning complexity in combinatorial pure exploration bandits
V Gabillon, A Lazaric, M Ghavamzadeh, R Ortner, P Bartlett
Artificial Intelligence and Statistics, 1004-1012, 2016
Cited by: 30, Year: 2016
Adaptive aggregation for reinforcement learning in average reward Markov decision processes
R Ortner
Annals of Operations Research 208 (1), 321-336, 2013
Cited by: 29, Year: 2013
Variational regret bounds for reinforcement learning
P Gajane, R Ortner, P Auer
arXiv preprint arXiv:1905.05857, 2019
Cited by: 26*, Year: 2019
Optimal regret bounds for selecting the state representation in reinforcement learning
OA Maillard, P Nguyen, R Ortner, D Ryabko
International Conference on Machine Learning, 543-551, 2013
Cited by: 26, Year: 2013
Pareto front identification from stochastic bandit feedback
P Auer, CK Chiang, R Ortner, M Drugan
Artificial Intelligence and Statistics, 939-947, 2016
Cited by: 25, Year: 2016
Improved regret bounds for undiscounted continuous reinforcement learning
K Lakshmanan, R Ortner, D Ryabko
International Conference on Machine Learning, 524-532, 2015
Cited by: 23, Year: 2015