We demonstrate how to reduce model size by pruning unneeded neurons. | Alex Shtoff
We study a way to represent a tilted loss as an average of losses by lifting to a higher-dimensional space, and then employing regular SGD. | Alex Shtoff
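
To make the lifting idea concrete, here is a minimal sketch based on one standard identity, log(x) = min over v of (x * exp(-v) + v - 1). It is an assumption that this matches the construction used in the post, which may differ. For t > 0, the tilted loss (1/t) * log(mean(exp(t * loss_i))) equals the minimum over an extra scalar v of an ordinary average of per-sample terms, so the parameters and v could be trained jointly with regular SGD. The names `tilted_loss`, `lifted_per_sample`, and `v` are illustrative only.

```python
import numpy as np

def tilted_loss(losses, t):
    # (1/t) * log(mean(exp(t * losses))), computed with a max-shift for stability
    z = t * losses
    m = z.max()
    return (m + np.log(np.mean(np.exp(z - m)))) / t

def lifted_per_sample(losses, v, t):
    # Per-sample term in the lifted space (model output, v); assumes t > 0.
    # Its average over samples, minimized over v, equals the tilted loss.
    return (np.exp(t * losses - v) + v - 1.0) / t

losses = np.array([0.1, 0.5, 2.0, 0.3])  # made-up per-sample losses
t = 1.5

# The minimizing v has a closed form: v* = log(mean(exp(t * losses)))
v_star = np.log(np.mean(np.exp(t * losses)))
lifted_avg = lifted_per_sample(losses, v_star, t).mean()

print(tilted_loss(losses, t), lifted_avg)
```

Because the lifted objective is a plain average over samples, a minibatch gradient of it is unbiased, which is not true for a minibatch estimate of the logarithm of an average.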