CSCI 335 - Programming Project #7

Fall 2017

robosim.zip
Apply Q-Learning to automatically learn the following task:
- The simulated robot should drive forward as much as possible.
- It should avoid hitting objects.
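One way to turn the two goals above into a signal that Q-Learning can optimize is a reward function. The sketch below is only an illustration: the action name "forward", the `collided` flag, and the numeric values are assumptions, not part of robosim.

```python
def reward(action, collided):
    """Hypothetical reward: penalize collisions, favor forward motion.

    The magnitudes (-10.0, 1.0, 0.0) are assumed starting points to tune.
    """
    if collided:
        return -10.0  # hitting an object outweighs any progress made
    if action == "forward":
        return 1.0    # driving forward is the desired behavior
    return 0.0        # turning is neutral: allowed, but not rewarded
```

Making the collision penalty larger than the forward reward keeps the robot from learning to drive straight into obstacles.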
You will need to determine ideal values for:
- The discount rate γ
- The annealing schedule for α and ε
- The amount of time for the robot to run before stopping
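To show where each of these parameters enters the algorithm, here is a minimal tabular Q-Learning sketch. The class and method names, the k / (k + t) decay used for both α and ε, and the default constants are all assumptions chosen for illustration. They are not taken from robosim and are not recommended values.

```python
import random
from collections import defaultdict


def annealed(k, t):
    """Decay from 1.0 toward 0.0 as t grows; larger k decays more slowly."""
    return k / (k + t)


class QLearner:
    def __init__(self, n_actions, gamma=0.9, alpha_k=10.0, epsilon_k=100.0, seed=0):
        self.n_actions = n_actions
        self.gamma = gamma            # discount rate
        self.alpha_k = alpha_k        # controls how fast the learning rate decays
        self.epsilon_k = epsilon_k    # controls how fast exploration decays
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.visits = defaultdict(int)  # per (state, action) update counts
        self.steps = 0                  # total action selections so far

    def choose(self, state):
        """Epsilon-greedy selection with epsilon annealed over total steps."""
        epsilon = annealed(self.epsilon_k, self.steps)
        self.steps += 1
        if self.rng.random() < epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state):
        """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        alpha = annealed(self.alpha_k, self.visits[(state, action)])
        self.visits[(state, action)] += 1
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += alpha * (target - self.q[state][action])
```

In this sketch α is annealed per state-action pair while ε is annealed over the whole run. Other schedules are equally reasonable, and comparing them is part of the tuning asked for above.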
Presentations on 10/31 (intermediate) and 11/2 (final)
Write a report detailing your results:
- Discuss evidence that the learned behavior improves on random action selection.
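One possible form of evidence is a side-by-side summary of several runs under the learned policy and several runs under random action selection. The helper below is a sketch under assumptions: the function name and the choice of "total reward per run" as the metric are hypothetical, and other metrics (such as forward moves or collisions per run) would work the same way.

```python
from statistics import mean, stdev


def compare_to_random(learned_totals, random_totals):
    """Summarize per-run totals for a learned policy and a random baseline.

    Each argument is a list with one number per run (at least two runs each,
    so that a sample standard deviation can be computed).
    """
    if len(learned_totals) < 2 or len(random_totals) < 2:
        raise ValueError("need at least two runs per policy")
    learned_mean = mean(learned_totals)
    random_mean = mean(random_totals)
    return {
        "learned_mean": learned_mean,
        "random_mean": random_mean,
        "learned_stdev": stdev(learned_totals),
        "random_stdev": stdev(random_totals),
        "mean_difference": learned_mean - random_mean,
    }
```

A positive mean difference that is large relative to the two standard deviations suggests the improvement is not just run-to-run noise.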