Comparing Policies
Policy p1 dominates p2 if and only if:
There exists a state s such that value(p1(s)) > value(p2(s))
For all states s, value(p1(s)) ≥ value(p2(s))
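
The dominance test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each policy is represented by a hypothetical dictionary mapping every state to the value obtained under that policy, and the function name `dominates` is chosen here, not taken from the slides.

```python
from typing import Dict, Hashable


def dominates(v1: Dict[Hashable, float], v2: Dict[Hashable, float]) -> bool:
    """Return True if the policy with values v1 dominates the policy with values v2.

    Assumption: v1 and v2 map the same set of states to the value
    achieved under each policy.
    """
    if v1.keys() != v2.keys():
        raise ValueError("both policies must be evaluated on the same states")
    # Condition 2: at least as good in every state.
    at_least_as_good = all(v1[s] >= v2[s] for s in v1)
    # Condition 1: strictly better in at least one state.
    strictly_better = any(v1[s] > v2[s] for s in v1)
    return at_least_as_good and strictly_better


# Hypothetical values for two states "a" and "b".
values_p1 = {"a": 1.0, "b": 2.0}
values_p2 = {"a": 1.0, "b": 1.0}
print(dominates(values_p1, values_p2))  # True
print(dominates(values_p2, values_p1))  # False
```

Both conditions are required: two policies with identical values in every state do not dominate each other, and neither do two policies that each win in a different state.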
Optimal policy
A policy p is optimal if there does not exist any policy that dominates it
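
As a rough illustration, the sketch below filters a finite list of candidate policies down to those that no other candidate dominates. It assumes each policy is given as a hypothetical dictionary from states to values, and the names `dominates` and `undominated` are invented for this example. Note that it only compares the listed candidates with each other, whereas the definition above ranges over all policies, so a survivor of this filter is not guaranteed to be optimal.

```python
from typing import Dict, Hashable, List

Values = Dict[Hashable, float]  # assumed representation: state -> value under a policy


def dominates(v1: Values, v2: Values) -> bool:
    """True if v1 is at least as good in every state and strictly better in one."""
    return all(v1[s] >= v2[s] for s in v1) and any(v1[s] > v2[s] for s in v1)


def undominated(candidates: Dict[str, Values]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, values in candidates.items()
        if not any(dominates(other, values) for other in candidates.values())
    ]


# Hypothetical candidate policies over two states "a" and "b".
candidates = {
    "p1": {"a": 1.0, "b": 2.0},
    "p2": {"a": 1.0, "b": 1.0},  # dominated by p1
    "p3": {"a": 0.5, "b": 3.0},  # better than p1 in "b", worse in "a"
}
print(undominated(candidates))  # ['p1', 'p3']
```

Here p1 and p3 both survive because each is better in a different state, which shows that several candidates can be undominated at once.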