Recently I got quite fascinated by integrating a variational auto-encoder 1 technique - or especially the reparameterization trick - within a larger computational graph in which I was trying to learn embeddings in a first stage and then try to find “blurry directions or regions” within that embeddings to navigate a larger model through an auto-regressive task. What I stumbled upon was that variational auto-encoder were usually used for discrete class targets but when changing the problem ...