User-uploaded documents have been a core component of Scribd’s business from the very beginning, understanding what is actually in the document corpus unlocks exciting new opportunities for discovery and recommendation. With Scribd anybody can upload and share documents, analogous to YouTube and videos. Over the years, our document corpus has become larger and more diverse which has made understanding it an ever-increasing challenge. Over the past year one of the missions of the Applied Res...