Probable Approximate Correctness (PAC)

Big Question:
- How many examples must we use for training?
Alternative formulation: Is a learned function any good?
- If it is bad, we will find out with high probability after a small number of examples.
- If it is consistent with a large set of training examples, it is unlikely to be wrong.
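
The first point can be made concrete with a small calculation. Assuming the examples are drawn independently (an assumption added here), a function whose error exceeds ε classifies one random example correctly with probability below 1 - ε, so it agrees with all m examples with probability below (1 - ε)^m. The sketch below computes that bound for a single fixed function; the names `survival_bound` and `examples_needed` are illustrative and do not come from the notes.

```python
import math

def survival_bound(epsilon: float, m: int) -> float:
    """Upper bound on the chance that one fixed function with error > epsilon
    agrees with all m independently drawn examples."""
    return (1.0 - epsilon) ** m

def examples_needed(epsilon: float, delta: float) -> int:
    """Smallest m for which (1 - epsilon)**m <= delta.

    This covers a single fixed bad function only; it is not a bound
    over a whole space of candidate functions."""
    return math.ceil(math.log(delta) / math.log(1.0 - epsilon))

# Example values (chosen for illustration): error above 10%, failure chance 5%.
m = examples_needed(0.1, 0.05)
```

With these illustrative values, a few dozen examples are enough to expose a function that is wrong more than 10% of the time, which is the sense in which "a small number of examples" suffices.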
Important assumption:
- The test set and training set are drawn randomly from X with the same probability distribution.
Approximate correctness of concept learning:
- Let Pr(X) be the probability distribution over X from which examples are drawn
- Let D = {x | (f(x) = 0 and x in C) or (f(x) = 1 and not (x in C))}, where f is the learned function and C is the target concept
  - In other words, D is the set of all incorrect classifications
- Let Error(f) be the sum, for all x in D, of Pr(x)
- We say that f is approximately correct with accuracy ε if and only if Error(f) ≤ ε
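
The definitions above can be sketched for a small finite domain. Everything concrete here is an assumption made for illustration: the domain 0..9, the uniform distribution, the concept C, the learned function f, and the helper name `error`.

```python
def error(f, concept, pr):
    """Sum Pr(x) over every x that f misclassifies.

    f       : function mapping x to 0 or 1
    concept : set of the x that belong to C
    pr      : dict mapping each x in the domain to its probability
    """
    # D holds the misclassified points: f says 0 for a member of C,
    # or f says 1 for a non-member of C.
    d = {x for x in pr if (f(x) == 1) != (x in concept)}
    return sum(pr[x] for x in d)

# Toy setup (assumed): integers 0..9 under a uniform distribution.
pr = {x: 0.1 for x in range(10)}
concept = {x for x in range(10) if x >= 5}   # C = {5, ..., 9}
f = lambda x: 1 if x >= 7 else 0             # wrong on 5 and 6 only

err = error(f, concept, pr)
approximately_correct = err <= 0.25          # check against accuracy 0.25
```

Here f misclassifies exactly two points of probability 0.1 each, so it is approximately correct with accuracy 0.25 but not with accuracy 0.1.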
Probable approximate correctness
- f is Probably Approximately Correct (PAC) with accuracy ε and probability 1-δ if and only if Pr(Error(f) > ε) < δ, where the probability is taken over the random draw of the training set
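
Pr(Error(f) > ε) can be estimated by simulation: draw many training sets, learn f from each, and count how often the error exceeds ε. The sketch below is a toy setup, not part of the notes. It assumes X = [0, 1) with a uniform distribution, a target concept C = {x | x ≥ 0.5}, and a hypothetical learner that uses the smallest positive example as its threshold.

```python
import random

def learn_threshold(sample, target=0.5):
    """Return the smallest positive example as the learned threshold.

    The learned f labels x as 1 when x >= threshold. If the sample has
    no positive example, fall back to 1.0 (f labels everything 0)."""
    positives = [x for x in sample if x >= target]
    return min(positives) if positives else 1.0

def estimate_failure_rate(m, epsilon, trials=2000, target=0.5, seed=0):
    """Monte Carlo estimate of Pr(Error(f) > epsilon) for training sets of size m."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(m)]
        h = learn_threshold(sample, target)
        # Under the uniform distribution, f errs exactly on [target, h),
        # so Error(f) = h - target.
        if h - target > epsilon:
            failures += 1
    return failures / trials

rate_small = estimate_failure_rate(m=5, epsilon=0.1)
rate_large = estimate_failure_rate(m=50, epsilon=0.1)
```

In this toy setup the estimate for m = 50 should fall well below δ = 0.05, while the estimate for m = 5 should not, which ties the PAC condition back to the question of how many training examples are needed.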