A team I work with is very happy with the k8s/ArgoCD setup we built, and now wants to manage their experimental ML workloads in k8s as well. These workloads run on Lambda Labs, which we use to quickly spin up a few GPU-enabled Ubuntu machines of varying sizes. Those servers are not part of a managed k8s cluster; instead, you’re given SSH and Jupyter access and are expected to deploy your workload yourself.