(A cautionary tale, illustrated via a toy example. If you do Bayesian hyperparameter tuning, you will want to know about this. Here’s a notebook with all of my code.) Here are two rotations of a function. We are going to examine how well a Gaussian Process (GP) models this function. I have provided example axis labels, but feel free to substitute your own. In my example scenario, we are training a neural network, first running it for some number of epochs at a high learning rate, then runni...