Delayed Q-Learning

PAC Markov Decision Processes (PAC-MDP)

Delayed Q-Learning is PAC-MDP

Calculating Example Sizes

Reference