One of the bigger risks of iterative statistical or machine learning fitting procedures is over-fit or the dreaded data leak. Over-fit is when: a model performs better on training data than on future data. Some degree of over-fit is expected. A data leak is when: the model learns things about […]