RL_Exercises

Reinforcement learning exercises from R. S. Sutton and A. G. Barto's "Reinforcement Learning: An Introduction" (first published in 1998).

Jack's Car Rental (Ch. 4): finds the policy for Jack that maximizes the expected reward (please refer to the book for details of the problem).

The heatmaps show Day 0 through Day 5.

$n_1$: Number of cars at parking lot 1.

$n_2$: Number of cars at parking lot 2.

The colors represent the number of cars to be moved from lot 1 to lot 2.
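As a rough illustration of one ingredient of this problem, the sketch below computes the expected one-day rental income at a single lot when requests follow a Poisson distribution. The $10 credit per rented car comes from the book's problem statement; the function names `poisson_pmf` and `expected_rental_reward` are hypothetical and are not taken from this repository.

```python
import math

RENTAL_CREDIT = 10  # dollars per rented car, as stated in the book's example


def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_rental_reward(n_cars, lam):
    """Expected one-day rental income at a lot holding n_cars cars.

    Rentals are capped by the cars available, so this is
    RENTAL_CREDIT * E[min(requests, n_cars)].
    """
    p_below = 0.0          # P(requests < n_cars)
    expected_rented = 0.0  # running sum of k * P(requests = k) for k < n_cars
    for k in range(n_cars):
        p = poisson_pmf(k, lam)
        p_below += p
        expected_rented += k * p
    # Every outcome with requests >= n_cars rents out all n_cars cars.
    expected_rented += n_cars * (1.0 - p_below)
    return RENTAL_CREDIT * expected_rented
```

A full solution would combine this term with the cost of moving cars overnight and the discounted value of the next state; this helper only covers the rental income part.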

Windy Gridworld (Ch. 6): uses Sarsa, an on-policy TD control algorithm, to find the quickest route to the goal while the wind is blowing upwards.

The colors represent the number of steps.
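The following is a minimal sketch of tabular Sarsa on a windy gridworld, not the code in this repository. It assumes the grid layout from the book's example (7 rows, 10 columns, the wind strengths in `WIND`, a reward of -1 per step) and illustrative hyperparameters; all function names are hypothetical.

```python
import random

ROWS, COLS = 7, 10
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]  # upward push per column (book's example)
START, GOAL = (3, 0), (3, 7)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply the action plus the wind of the current column; reward is -1 per step."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row = min(max(row + d_row - WIND[col], 0), ROWS - 1)
    new_col = min(max(col + d_col, 0), COLS - 1)
    return (new_row, new_col), -1


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one (ties broken randomly)."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def sarsa(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Sarsa: update Q(s, a) toward r + gamma * Q(s', a') with a' chosen on-policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = START
        action = epsilon_greedy(q, state, epsilon, rng)
        while state != GOAL:
            next_state, reward = step(state, action)
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            td_target = reward + gamma * q[next_state][next_action]
            q[state][action] += alpha * (td_target - q[state][action])
            state, action = next_state, next_action
    return q


def greedy_path_length(q, max_steps=1000):
    """Follow the greedy policy from START; returns max_steps if the goal is not reached."""
    state, steps = START, 0
    while state != GOAL and steps < max_steps:
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _ = step(state, action)
        steps += 1
    return steps
```

Calling `greedy_path_length(sarsa())` gives the length of the route that the learned greedy policy follows from the start cell, which is one way to check how close the result is to the quickest route.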

Mountain Car (Ch. 9): uses TD($\lambda$) with a continuous state space and discrete actions to find an optimal policy for the car to reach the goal.

After 10 episodes

After 100 episodes

After 1000 episodes

Value function (after 100 episodes)
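For reference, a single TD($\lambda$) update can be sketched as below. This assumes a linear value function over some feature vector and accumulating eligibility traces; the README does not say which function approximator or trace type the Mountain Car code uses, so the function name `td_lambda_update` and all parameters here are hypothetical.

```python
def td_lambda_update(weights, trace, features, next_features, reward,
                     alpha=0.1, gamma=1.0, lam=0.9, terminal=False):
    """One TD(lambda) step for a linear value function (assumed, not from the repository).

    weights, trace, features, and next_features are equal-length lists of floats.
    Returns the updated (weights, trace) without modifying the inputs.
    """
    value = sum(w * x for w, x in zip(weights, features))
    # The value of a terminal state is defined as zero.
    next_value = 0.0 if terminal else sum(
        w * x for w, x in zip(weights, next_features))
    td_error = reward + gamma * next_value - value
    # Accumulating trace: decay the old trace, then add the current features.
    new_trace = [gamma * lam * z + x for z, x in zip(trace, features)]
    new_weights = [w + alpha * td_error * z for w, z in zip(weights, new_trace)]
    return new_weights, new_trace
```

For control with discrete actions, the same update can be applied to features of state-action pairs, with the trace reset to zeros at the start of each episode.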
