• Cross validation: lets us compare different machine learning methods and get a sense of how well they will work in practice.
  • For each method, we need to train it and then test it.
    • So we must hold back some of the data for testing, rather than using it all for training.

If we use, for example, 25% for testing and 75% for training, how do we know “which 25%” to pick? The first 25%? The middle?

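  • Cross validation uses every block in turn: each 25% block is held out for testing once while the other 75% is used for training, and the results are summarised at the end (for example, by averaging the test scores).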
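
The idea above can be sketched in a few lines of plain Python. This is only an illustration, not a reference implementation: the helper names (`k_fold_splits`, `cross_validate`, `fit_mean`, `predict_mean`), the toy data, and the choice of mean squared error as the score are all assumptions made for the example. With k=4, each block is 25% of the data, matching the split discussed above.

```python
from statistics import mean


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous blocks."""
    # Spread any remainder over the first few blocks so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(xs, ys, fit, predict, k=4):
    """Train on k-1 blocks, test on the held-out block, repeat for every block."""
    scores = []
    for train, test in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Score for this block: mean squared error on the held-out samples.
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        scores.append(mean(errors))
    # Summarise at the end: the average score, plus the per-block scores.
    return mean(scores), scores


# Hypothetical toy "method": always predict the mean of the training targets.
def fit_mean(xs, ys):
    return mean(ys)


def predict_mean(model, x):
    return model


xs = list(range(8))
ys = [2.0 * x for x in xs]
average_error, per_fold = cross_validate(xs, ys, fit_mean, predict_mean, k=4)
print(average_error, per_fold)
```

To compare two methods, one would call `cross_validate` once per method on the same data and compare the averaged scores. In practice the data is usually shuffled before splitting, which this sketch leaves out for simplicity.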