CSCI 335 - Programming Project #7
Due: Thursday, November 2, at the beginning of class
- Apply Q-Learning to automatically learn the following task:
  - The simulated robot should drive forward as much as possible.
  - It should avoid hitting objects.
- You will need to determine ideal values for:
  - The discount rate γ
  - The annealing schedule for α and ε
  - The amount of time for the robot to run before stopping
- Presentations on 10/31 (intermediate) and 11/2 (final)
- Write a report detailing your results:
  - Discuss evidence indicating that the learned behavior is an improvement over random action selection.
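
As a rough illustration of how the pieces above fit together, here is a minimal Python sketch of tabular Q-learning with annealed α and ε, followed by a comparison against random action selection. Everything specific in it is an assumption made for illustration and is not part of the assignment: the two-state toy world in `step`, the reward values, the hyperbolic form of `annealed`, and the constants passed to it. The course simulator, its state and action sets, and its rewards will differ.

```python
import random

FORWARD, TURN = 0, 1      # hypothetical action set
CLEAR, BLOCKED = 0, 1     # hypothetical two-state sensor reading

def step(state, action, rng):
    """Toy stand-in for the simulator; returns (reward, next_state)."""
    if action == TURN:
        return 0.0, CLEAR                 # turning always clears the path
    if state == BLOCKED:
        return -10.0, BLOCKED             # drove forward into the obstacle
    next_state = BLOCKED if rng.random() < 0.3 else CLEAR
    return 1.0, next_state                # drove forward safely

def annealed(start, floor, decay, t):
    """Hyperbolic decay from `start` toward `floor` as the step count t grows."""
    return max(floor, start / (1.0 + decay * t))

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])

def q_update(q, s, a, reward, s_next, alpha, gamma):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])

def train(steps, gamma, rng):
    """Run Q-learning for a fixed number of steps and return the Q-table."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    state = CLEAR
    for t in range(steps):
        alpha = annealed(0.5, 0.05, 0.01, t)      # assumed schedule constants
        epsilon = annealed(1.0, 0.01, 0.01, t)    # assumed schedule constants
        action = choose_action(q[state], epsilon, rng)
        reward, next_state = step(state, action, rng)
        q_update(q, state, action, reward, next_state, alpha, gamma)
        state = next_state
    return q

def average_reward(q, epsilon, steps, rng):
    """Mean reward per step; epsilon=1.0 gives purely random action selection."""
    state, total = CLEAR, 0.0
    for _ in range(steps):
        action = choose_action(q[state], epsilon, rng)
        reward, state = step(state, action, rng)
        total += reward
    return total / steps

if __name__ == "__main__":
    rng = random.Random(0)
    q = train(steps=5000, gamma=0.9, rng=rng)
    print("learned:", average_reward(q, 0.0, 1000, rng))
    print("random: ", average_reward(q, 1.0, 1000, rng))
```

Running the greedy policy (ε = 0) and the random policy (ε = 1) for the same number of steps and comparing mean reward per step is one simple kind of evidence for the report. Repeating the comparison over several seeds and several values of γ would make it more convincing.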