Authors: Daniel Vega-Myhre (Google), Abdullah Gharaibeh (Google), Kevin Hannon (Red Hat). In this article, we introduce JobSet, an open source API for representing distributed jobs. The goal of JobSet is to provide a unified API for distributed ML training and HPC workloads on Kubernetes. Why JobSet? The Kubernetes community’s recent enhancements to the batch ecosystem on Kubernetes have attracted ML engineers who have found it to be a natural fit for the requirements of running distributed t...| Kubernetes
In this SIG etcd spotlight, we talked with James Blair, Marek Siarkowicz, Wenjia Zhang, and Benjamin Wang to learn a bit more about this Kubernetes Special Interest Group. Introducing SIG etcd Frederico: Hello, thank you for the time! Let’s start with some introductions: could you tell us a bit about yourself, your role, and how you got involved in Kubernetes? Benjamin: Hello, I am Benjamin. I am a SIG etcd Tech Lead and one of the etcd maintainers.| Kubernetes
Custom resources are extensions of the Kubernetes API. This page discusses when to add a custom resource to your Kubernetes cluster and when to use a standalone service. It describes the two methods for adding custom resources and how to choose between them. Custom resources A resource is an endpoint in the Kubernetes API that stores a collection of API objects of a certain kind; for example, the built-in pods resource contains a collection of Pod objects.| Kubernetes
This page shows how to configure liveness, readiness and startup probes for containers. For more information about probes, see Liveness, Readiness and Startup Probes. The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.| Kubernetes
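
The deadlock case in the liveness-probe excerpt can be illustrated from the application side. The following is a minimal sketch, not taken from the Kubernetes documentation: it assumes the application records a heartbeat whenever it makes progress and exposes a health endpoint whose status code an HTTP liveness probe would check. The names `Heartbeat` and `healthz_status` and the staleness threshold are hypothetical.

```python
import time


class Heartbeat:
    """Tracks the last time the worker made progress (hypothetical helper)."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self._clock = clock
        self._max_age = max_age_seconds
        self._last_beat = clock()

    def beat(self):
        # Called by the worker loop each time it completes a unit of work.
        self._last_beat = self._clock()

    def is_alive(self):
        # A stale heartbeat suggests the worker is stuck, e.g. deadlocked.
        return (self._clock() - self._last_beat) <= self._max_age


def healthz_status(heartbeat):
    """Status code a health endpoint could return for an HTTP liveness probe.

    The kubelet treats any code from 200 up to (but not including) 400 as
    success; anything else counts as a probe failure.
    """
    return 200 if heartbeat.is_alive() else 500
```

In this sketch the process can still answer HTTP requests while its worker is stuck, so the endpoint reports failure and the kubelet can restart the container once the configured failure threshold is reached.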