But it may well be semi-supervised. For some time now I have thought that building a latent representation of protein sequence space is a really good idea, both because we have far more sequences than any form of labelled data, and because, once built, such a representation can inform a broad range of downstream tasks. … Continue reading →