Background

Clustering of patients to find new “phenotypes” is now a fad. For example, repeating the false assertion that diabetes was ever a binary diagnosis, Ahlqvist et al. claimed to have found five diabetes subtypes using a purely statistical analysis not driven by clinical knowledge. What they found is likely just inefficient prognostic stratification that could be improved upon by directly relating patient characteristics to outcomes. Maarten van Smeden showed that clustering algorithms...