After removing dropout from the input and training for 193 epochs with three manual reductions of the learning rate, I’ve now reached 2.06% training error and 3.32% validation error using a full-sized SqueezeNet with CUBAN-style input layers. This is in contrast to the same network with dropout applied on the input, which got …