Ronald Ortner
Verified email at unileoben.ac.at
Title
Cited by
Year
Near-optimal regret bounds for reinforcement learning
P Auer, T Jaksch, R Ortner
Advances in neural information processing systems 21, 2008
Cited by: 1493; Year: 2008
UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem
P Auer, R Ortner
Periodica Mathematica Hungarica 61 (1-2), 55-65, 2010
Cited by: 372; Year: 2010
Logarithmic online regret bounds for undiscounted reinforcement learning
P Auer, R Ortner
Advances in neural information processing systems 19, 2006
Cited by: 295; Year: 2006
Improved rates for the stochastic continuum-armed bandit problem
P Auer, R Ortner, C Szepesvári
International Conference on Computational Learning Theory, 454-468, 2007
Cited by: 263; Year: 2007
Adaptively tracking the best bandit arm with an unknown number of distribution changes
P Auer, P Gajane, R Ortner
Conference on Learning Theory, 138-158, 2019
Cited by: 150*; Year: 2019
Efficient bias-span-constrained exploration-exploitation in reinforcement learning
R Fruit, M Pirotta, A Lazaric, R Ortner
International Conference on Machine Learning, 1578-1586, 2018
Cited by: 111; Year: 2018
A boosting approach to multiple instance learning
P Auer, R Ortner
European conference on machine learning, 63-74, 2004
Cited by: 107; Year: 2004
Online regret bounds for undiscounted continuous reinforcement learning
R Ortner, D Ryabko
Advances in Neural Information Processing Systems 25, 2012
Cited by: 89; Year: 2012
Regret bounds for restless Markov bandits
R Ortner, D Ryabko, P Auer, R Munos
International conference on algorithmic learning theory, 214-228, 2012
Cited by: 86; Year: 2012
Variational regret bounds for reinforcement learning
R Ortner, P Gajane, P Auer
Uncertainty in Artificial Intelligence, 81-90, 2020
Cited by: 68; Year: 2020
PAC-Bayesian analysis of contextual bandits
Y Seldin, P Auer, J Shawe-Taylor, R Ortner, F Laviolette
Advances in neural information processing systems 24, 2011
Cited by: 58; Year: 2011
Regret bounds for restless Markov bandits
R Ortner, D Ryabko, P Auer, R Munos
Theoretical Computer Science 558, 62-76, 2014
Cited by: 55; Year: 2014
Regret bounds for reinforcement learning via Markov chain concentration
R Ortner
Journal of Artificial Intelligence Research 67, 115-128, 2020
Cited by: 49; Year: 2020
A sliding-window algorithm for Markov decision processes with arbitrarily changing rewards and transitions
P Gajane, R Ortner, P Auer
arXiv preprint arXiv:1805.10066, 2018
Cited by: 48; Year: 2018
Non-backtracking random walks and cogrowth of graphs
R Ortner, W Woess
Canadian Journal of Mathematics 59 (4), 828-844, 2007
Cited by: 46; Year: 2007
Improved learning complexity in combinatorial pure exploration bandits
V Gabillon, A Lazaric, M Ghavamzadeh, R Ortner, P Bartlett
Artificial Intelligence and Statistics, 1004-1012, 2016
Cited by: 45; Year: 2016
Pareto front identification from stochastic bandit feedback
P Auer, CK Chiang, R Ortner, M Drugan
Artificial intelligence and statistics, 939-947, 2016
Cited by: 44; Year: 2016
Improved regret bounds for undiscounted continuous reinforcement learning
K Lakshmanan, R Ortner, D Ryabko
International conference on machine learning, 524-532, 2015
Cited by: 44; Year: 2015
Pseudometrics for state aggregation in average reward Markov decision processes
R Ortner
Algorithmic Learning Theory: 18th International Conference, ALT 2007, Sendai …, 2007
Cited by: 39; Year: 2007
Adaptive aggregation for reinforcement learning in average reward Markov decision processes
R Ortner
Annals of Operations Research 208, 321-336, 2013
Cited by: 35; Year: 2013
Articles 1–20