Learning Decision Trees
- Start with a single root node containing all training examples.
- While any leaf node contains examples with different labels:
  - Arbitrarily select a feature whose value differs among the examples in that leaf.
  - Replace that leaf with an interior node and two child leaf nodes:
    - Use the selected feature as the interior node's test.
    - Place all examples where the feature is false in the left child.
    - Place all other examples in the right child.
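The steps above can be sketched as a short recursive procedure. This is a minimal illustration, not a full implementation: it assumes binary (true/false) features, represents each example as a `(feature_dict, label)` pair, and simply picks the first feature that splits the examples, matching the "arbitrarily select" step. The function names `learn_tree` and `predict` are invented for this sketch.

```python
from collections import Counter

def learn_tree(examples, features):
    """Grow a decision tree over binary features.

    examples: list of (feature_dict, label) pairs.
    Returns either a label (a leaf) or a tuple
    (feature, left_subtree, right_subtree) for an interior node.
    """
    labels = [label for _, label in examples]
    # Stop when the leaf is pure: all examples share one label.
    if len(set(labels)) == 1:
        return labels[0]
    # Arbitrarily select a feature whose value differs among these examples.
    for f in features:
        if len({fd[f] for fd, _ in examples}) > 1:
            # Split: false values go to the left child, true to the right.
            left = [(fd, lbl) for fd, lbl in examples if not fd[f]]
            right = [(fd, lbl) for fd, lbl in examples if fd[f]]
            return (f, learn_tree(left, features),
                       learn_tree(right, features))
    # No feature separates the examples (identical feature vectors with
    # conflicting labels): fall back to the majority label.
    return Counter(labels).most_common(1)[0][0]

def predict(tree, fd):
    """Follow the tree from the root to a leaf for one feature dict."""
    while isinstance(tree, tuple):
        f, left, right = tree
        tree = right if fd[f] else left
    return tree
```

For example, training on the four rows of the AND function yields a tree that first tests `a` and then, in the right child, tests `b`:

```python
data = [({"a": False, "b": False}, 0),
        ({"a": False, "b": True},  0),
        ({"a": True,  "b": False}, 0),
        ({"a": True,  "b": True},  1)]
tree = learn_tree(data, ["a", "b"])
```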
Computational Complexity
- How do we measure input size?
- What if we want the tree with the absolute minimum number of nodes?