Final Summary
1. Intro. Overall, this has been an incredibly instructive journey in the world of feed-forward CNNs. Aside from developing an excellent cat vs. dog classifier called KITNN, I’ve learnt a lot and accomplished several personal milestones in my development as a researcher.
2. What I Sought to Do. Early on, I decided I’d like …
Close, but no cigar: PReLUs give ~2.84%
After appropriately patching Keras, I proceeded with PReLUs. Unfortunately, I’ve not matched plain ReLUs on this task, although I came close. At Epoch 61, I scored a low of 1.72% training and 2.84% validation error, passably good but not as good as the 2.48% validation error my plain ReLU network achieved. The training curves can …
In my previous post, I had brought up, as a possible research direction, the sparsification used by the SqueezeNet authors, which broadly speaking involves: train the neural network until it stops …
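The excerpt cuts off before the full recipe, but the sparsification associated with SqueezeNet (the Deep Compression line of work) is usually magnitude pruning: train to convergence, zero out the smallest-magnitude weights, then retrain while keeping the zeros fixed. A minimal NumPy sketch of the pruning step only; the `magnitude_prune` helper and the 50% sparsity level are illustrative choices, not anything taken from the post:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    threshold = np.percentile(np.abs(weights), sparsity * 100.0)
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

# Sketch of use on one Keras layer's kernel; during retraining the mask must
# be re-applied after every update so the pruned weights stay at zero.
# W, b = layer.get_weights()
# W, mask = magnitude_prune(W, sparsity=0.5)
# layer.set_weights([W, b])
```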
New Success with SqueezeNet: 3.32% Validation Error
After removing dropout from the input and training for 193 epochs with three manual lowerings of the learning rate, I’ve now arrived at 2.06% training error and 3.32% validation error, using a full-sized SqueezeNet with CUBAN-style input layers. This is in contrast to the same network with dropout applied to the input, which got …
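The post does not say exactly how the three lowerings were performed; a minimal sketch of one way to drop the learning rate by hand mid-training in Keras, assuming a compiled model with a standard optimizer:

```python
from keras import backend as K

def lower_learning_rate(model, factor=0.1):
    """Scale the optimizer's learning rate down by `factor`, in place."""
    old_lr = float(K.get_value(model.optimizer.lr))
    new_lr = old_lr * factor
    K.set_value(model.optimizer.lr, new_lr)
    return new_lr

# e.g. pause training once the validation error plateaus, then:
# lower_learning_rate(model, factor=0.1)
# model.fit(...)  # resume at the reduced rate
```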
Trying to Scale Up to Full-Size SqueezeNet
In a previous post I described how I had stripped down SqueezeNet to make it fit on my Nvidia GTX 765M card. Adding the CUBAN input layers more than tripled the memory requirements of the first convolution and put me at the very edge of available memory on my GPU. Desirous to …
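To make the memory pressure concrete, here is a rough back-of-envelope for the activation footprint of a single convolution layer; the dimensions in the example are made up for illustration and are not KITNN's actual ones:

```python
def conv_output_mebibytes(batch, maps, height, width, bytes_per_value=4):
    """Approximate float32 memory for one layer's output feature maps."""
    return batch * maps * height * width * bytes_per_value / 2**20

# Example with made-up numbers: 96 feature maps of 111x111 for a batch of 32
# already costs ~144 MiB, before gradients and workspace are counted, so a
# 3x increase in the first convolution is felt immediately on a laptop GPU.
print(conv_output_mebibytes(32, 96, 111, 111))
```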
CUBAN-Style Input Layers: 5.76% Validation Error
As of the end of Epoch 27, I’ve now achieved 5.76% validation and 5.08444% training error rates on the dataset using the network I talked about in my previous post. While I did babysit the network, interrupting training at intervals and speeding up the decay of the learning rate, for the most part …
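A minimal sketch of the interrupt-and-resume pattern in Keras, assuming weights are checkpointed to an HDF5 file; the post does not describe its actual babysitting setup, so the callback and filename below are illustrative:

```python
from keras.callbacks import ModelCheckpoint

# Save the best weights seen so far, so training can be interrupted safely.
checkpoint = ModelCheckpoint('weights.hdf5', monitor='val_loss',
                             save_best_only=True, save_weights_only=True)
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           callbacks=[checkpoint])

# Later, reload and resume (e.g. after manually speeding up the LR decay):
# model.load_weights('weights.hdf5')
# model.fit(...)
```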
Two weeks ago, I realized that my pure-Theano code simply wasn’t scaling. It was too difficult to test anything at all, because it required changes all over the place. I would also have to im…
El Problemo. As Guillaume Berger warned me here, it turns out that when you ask Keras to do PReLUs, it allocates one learnable “alpha” parameter per neuron! Consequence: I went from …
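For reference, the remedy is to share the alpha across spatial positions, leaving one parameter per feature map instead of one per neuron. Current Keras exposes this directly through `shared_axes`; in early 2016 the layer had to be patched by hand, which is presumably the patching mentioned earlier. A minimal sketch, assuming Theano-style `(batch, channels, height, width)` tensors:

```python
from keras.layers import PReLU

# Share the learnable alpha over the spatial axes (2 and 3 in channels-first
# layout), so there is one alpha per channel rather than one per neuron.
channel_shared_prelu = PReLU(shared_axes=[2, 3])

# A plain PReLU() learns a separate alpha for every activation, which is what
# blows up the parameter count described above.
```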
First, an idea that didn’t work. I hypothesized that images shouldn’t need to be rotated as far as 60 degrees and that 30 degrees might be enough. Wrong. This caused training to plateau…
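The excerpt does not say how the rotations were implemented; as an illustration only, with Keras’s built-in augmentation the rotation limit is a single knob, so the 60-versus-30 degree experiment is a one-line change:

```python
from keras.preprocessing.image import ImageDataGenerator

# Random rotations of up to 60 degrees per image; dropping this to 30 is the
# hypothesis that, per the post, made training plateau.
augmenter = ImageDataGenerator(rotation_range=60)
# model.fit_generator(augmenter.flow(X_train, y_train, batch_size=32), ...)
```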
I’ve just created a new SqueezeNet-based model with Keras, which reuses some of my ideas from CUBAN, my IFT6390 project. In that project I had been using: “Fine” filters, run on t…
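The excerpt cuts off before the CUBAN “fine” filter details, but for context, the SqueezeNet building block the model is based on is the Fire module: a 1x1 “squeeze” convolution followed by parallel 1x1 and 3x3 “expand” convolutions whose outputs are concatenated. A minimal sketch in modern Keras functional style, assuming channels-last tensors; it is not KITNN’s actual code:

```python
from keras.layers import Conv2D, Concatenate

def fire_module(x, squeeze=16, expand=64):
    """SqueezeNet Fire module: 1x1 squeeze, then parallel 1x1/3x3 expand."""
    s = Conv2D(squeeze, (1, 1), activation='relu', padding='same')(x)
    e1 = Conv2D(expand, (1, 1), activation='relu', padding='same')(s)
    e3 = Conv2D(expand, (3, 3), activation='relu', padding='same')(s)
    return Concatenate(axis=-1)([e1, e3])
```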