Slotted ALOHA Based on Reinforcement Learning with Thompson Sampling

Author(s):  
Yeong-Je Jo ◽  
Gyung-Ho Hwang
Author(s):  
Jan Leike ◽  
Tor Lattimore ◽  
Laurent Orseau ◽  
Marcus Hutter

We discuss some recent results on Thompson sampling for nonparametric reinforcement learning in countable classes of general stochastic environments. These environments can be non-Markovian, non-ergodic, and partially observable. We show that Thompson sampling learns the environment class in the sense that (1) asymptotically, its value converges in mean to the optimal value, and (2) given a recoverability assumption, its regret is sublinear. We conclude with a discussion of optimality in reinforcement learning.
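As a rough illustration of the idea in the abstract above, the following sketch runs Thompson sampling over a small finite class of candidate environments. It is a simplified stand-in, not the construction from the paper: the environments here are assumed to be Bernoulli bandits (stateless and fully observable), the class is finite rather than countable, and the agent resamples at every step. The names `thompson_sampling`, `envs`, and `true_env` are hypothetical and chosen for this example only.

```python
import random


def thompson_sampling(envs, true_env, steps, rng):
    """Thompson sampling over a finite class of Bernoulli bandit environments.

    envs     -- list of tuples; envs[k][a] is the success probability of arm a
                under hypothesis k (assumed strictly between 0 and 1)
    true_env -- the tuple of arm probabilities that actually generates rewards
                (assumed to be a member of envs)
    Returns the final posterior over envs and the total reward collected.
    """
    posterior = [1.0 / len(envs)] * len(envs)  # uniform prior over the class
    total_reward = 0
    for _ in range(steps):
        # 1. Sample one hypothesis from the current posterior.
        k = rng.choices(range(len(envs)), weights=posterior)[0]
        # 2. Act optimally as if the sampled hypothesis were the truth.
        arm = max(range(len(envs[k])), key=lambda a: envs[k][a])
        # 3. Observe a reward from the true environment.
        reward = 1 if rng.random() < true_env[arm] else 0
        total_reward += reward
        # 4. Bayes update: reweight each hypothesis by the likelihood it
        #    assigns to the observed reward, then renormalize.
        likelihoods = [p[arm] if reward else 1.0 - p[arm] for p in envs]
        unnormalized = [w * l for w, l in zip(posterior, likelihoods)]
        z = sum(unnormalized)
        posterior = [u / z for u in unnormalized]
    return posterior, total_reward


if __name__ == "__main__":
    candidate_envs = [(0.2, 0.8), (0.8, 0.2), (0.5, 0.5)]
    posterior, reward = thompson_sampling(
        candidate_envs, candidate_envs[0], steps=2000, rng=random.Random(0)
    )
    print("posterior:", posterior)
    print("average reward:", reward / 2000)
```

In this toy setting the posterior mass tends to concentrate on the true hypothesis, so the average reward approaches that of the best arm. The general environments in the abstract need not allow this kind of recovery from early mistakes, which is why the sublinear-regret claim there is conditional on a recoverability assumption.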


Author(s):  
Ibrahim Ayoub ◽  
Iman Hmedoush ◽  
Cedric Adjih ◽  
Kinda Khawam ◽  
Samer Lahoud

Author(s):  
Molly Zhang ◽  
Luca de Alfaro ◽  
J.J. Garcia-Luna-Aceves

Decision ◽  
2016 ◽  
Vol 3 (2) ◽  
pp. 115-131 ◽  
Author(s):  
Helen Steingroever ◽  
Ruud Wetzels ◽  
Eric-Jan Wagenmakers
