Week 1: Introduction
Week 2: Bandit algorithms – UCB, PAC
Week 3: Bandit algorithms – Median Elimination, Policy Gradient
Week 4: Full RL & MDPs
Week 5: Bellman Optimality
Week 6: Dynamic Programming & TD Methods
Week 7: Eligibility Traces
Week 8: Function Approximation
Week 9: Least Squares Methods
Week 10: Fitted Q, DQN & Policy Gradient for Full RL
Week 11: Hierarchical RL
Week 12: POMDPs

Platform