Cosmin Paduraru
DeepMind
Verified email at google.com
Title / Cited by / Year
Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
G Dulac-Arnold, N Levine, DJ Mankowitz, J Li, C Paduraru, S Gowal, ...
Machine Learning 110 (9), 2419-2468, 2021
Cited by 519*, 2021
Safe exploration in continuous action spaces
G Dalal, K Dvijotham, M Vecerik, T Hester, C Paduraru, Y Tassa
arXiv preprint arXiv:1801.08757, 2018
Cited by 248, 2018
RL Unplugged: A suite of benchmarks for offline reinforcement learning
C Gulcehre, Z Wang, A Novikov, T Paine, S Gómez, K Zolna, R Agarwal, ...
Advances in Neural Information Processing Systems 33, 7248-7259, 2020
Cited by 76*, 2020
Hyperparameter selection for offline reinforcement learning
TL Paine, C Paduraru, A Michi, C Gulcehre, K Zolna, A Novikov, Z Wang, ...
arXiv preprint arXiv:2007.09055, 2020
Cited by 72, 2020
Benchmarks for deep off-policy evaluation
J Fu, M Norouzi, O Nachum, G Tucker, Z Wang, A Novikov, M Yang, ...
arXiv preprint arXiv:2103.16596, 2021
Cited by 29, 2021
Off-policy evaluation in Markov decision processes
C Paduraru
Cited by 27, 2013
Off-policy learning with options and recognizers
D Precup, C Paduraru, A Koop, RS Sutton, S Singh
Advances in Neural Information Processing Systems 18, 2005
Cited by 26, 2005
Responding to new information in a mining complex: Fast mechanisms using machine learning
C Paduraru, R Dimitrakopoulos
Mining Technology, 2019
Cited by 14, 2019
Grounding Abstractions in Predictive State Representations.
B Tanner, V Bulitko, A Koop, C Paduraru
IJCAI, 1077-1082, 2007
Cited by 14, 2007
Development and validation of a supervised machine learning radar Doppler spectra peak-finding algorithm
H Kalesse, T Vogl, C Paduraru, E Luke
Atmospheric Measurement Techniques 12 (8), 4591-4617, 2019
Cited by 12, 2019
Adaptive policies for short-term material flow optimization in a mining complex
C Paduraru, R Dimitrakopoulos
Mining Technology 127 (1), 56-63, 2018
Cited by 11, 2018
Autoregressive dynamics models for offline policy evaluation and optimization
MR Zhang, TL Paine, O Nachum, C Paduraru, G Tucker, Z Wang, ...
arXiv preprint arXiv:2104.13877, 2021
Cited by 9, 2021
Active offline policy selection
K Konyushova, Y Chen, T Paine, C Gulcehre, C Paduraru, DJ Mankowitz, ...
Advances in Neural Information Processing Systems 34, 24631-24644, 2021
Cited by 7, 2021
Planning with approximate and learned models of Markov decision processes
C Paduraru
Master's thesis, University of Alberta, 2007
Cited by 7, 2007
Challenges of Real-World Reinforcement Learning: Definitions, Benchmarks & Analysis
C Paduraru, DJ Mankowitz, G Dulac-Arnold, J Li, N Levine, S Gowal, ...
Cited by 6, 2021
Autoregressive dynamics models for offline policy evaluation and optimization
MR Zhang, T Paine, O Nachum, C Paduraru, G Tucker, Z Wang, M Norouzi
International Conference on Learning Representations, 2021
Cited by 6, 2021
Model-based reinforcement learning with state aggregation
C Paduraru, R Kaplow, D Precup, J Pineau
8th European Workshop on Reinforcement Learning, 2008
Cited by 6, 2008
A study of off-policy learning in computational sustainability
C Paduraru, D Precup, J Pineau, G Comanici
European Workshop on Reinforcement Learning (EWRL) 24, 89-102, 2012
Cited by 4, 2012
Robust constrained reinforcement learning for continuous control with model misspecification
DJ Mankowitz, DA Calian, R Jeong, C Paduraru, N Heess, S Dathathri, ...
arXiv preprint arXiv:2010.10644, 2020
Cited by 3, 2020
Temporal abstraction
D Precup, C Paduraru, A Koop, RS Sutton, S Singh
URL: http://videolectures. net/site/normal_dl/tag 1199094, 2018
Cited by 3, 2018