Home on GPU Utils ⚡️ (gpus.llm-utils.org)
I don’t like it when people use logic to try and get me to do something I don’t want to do.
Cancelling SaaS Subscriptions Sucks # Perhaps this could be a nice task for an AI agent product # This feels like a nicely shaped AI problem.
For now this is a public draft.
Here are the Stable Diffusion guides we have:
As of July 27, I think the Docker templates are now fixed, so the below shouldn’t be needed.
Free playgrounds # 70B-chat by Yuvraj at Hugging Face: https://huggingface.
Summary # If you’re looking for a specific open-source LLM, you’ll see that there are lots of variations of it.
Name Price (Images Per Month) Easy LoRAs?
APIs are against Midjourney’s TOS.
Name Price Per Image Pricing Details 🏆 RunPod $0.
Name Price/Second for GPU GPU 🏆 Runpod.
Which GPU cloud should you use?
We made a tool: ComputeWatch.
Falcon-40B is interesting, but running it isn’t as easy as it could be.
Updates:
Aimed at enterprises.
Here’s our giant list of alternative GPU clouds.
vCPU: “A vCPU, or virtual CPU, is a unit of processing power that is allocated to a virtual machine (VM) or a piece of software running on a physical server.”
SXM is an Nvidia-specific alternative to PCIe.
There are 3 similarly named models; they are different and have different performance:
- RTX 6000 (Quadro RTX 6000, 24 GB VRAM, launched Aug 13, 2018)
- RTX A6000 (48 GB VRAM, launched Oct 5, 2020)
- RTX 6000 Ada (48 GB VRAM, launched Dec 3, 2022)
What’s the difference between a DGX GH200, a GH200, and an H100?
GPU VRAM 16-bit inference perf rank Available on Runpod (Instant) Available on Lambda (Instant) Available on FluidStack (Instant) H100 80 GB 🏆 1 No ✅ $1.
Clusters of hundreds or thousands of H100s.
Note that this list is aimed at cloud GPUs, where renting the more expensive GPUs is comparatively cheap versus buying the GPU outright.
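To make the cloud-vs-buy tradeoff concrete, here is a minimal break-even sketch. Both prices are illustrative assumptions, not figures quoted in the post.

```python
# Illustrative break-even sketch; both prices below are assumptions,
# not numbers from the post.
purchase_price_usd = 30_000   # assumed upfront cost of one high-end GPU
rental_usd_per_hour = 2.00    # assumed on-demand cloud price per GPU-hour

breakeven_hours = purchase_price_usd / rental_usd_per_hour
print(f"Break-even after ~{breakeven_hours:,.0f} GPU-hours "
      f"(~{breakeven_hours / 24 / 365:.1f} years of 24/7 use)")
```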
GPU Cost at Runpod Cost at FluidStack Cost at Lambda Labs 1x H100 Not available ✅ $1.
GPU Inference speed relative to 2x H100s (est) Speed / $ (relative est) Cost at Runpod Cost at FluidStack Cost at Lambda Labs 2x H100s 100% Not available Not available Not available in an on-demand 2x instance Not available in an on-demand 2x instance 2x 6000 Ada 48% 0.
Why does this exist?
To run Falcon-40B, 85 GB+ of GPU RAM is preferred.
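As a rough sanity check on that figure (my arithmetic, not from the post): 16-bit weights take about 2 bytes per parameter, so a 40B-parameter model needs ~80 GB for weights alone, before activations and KV cache. The estimator below is a hypothetical sketch, not Falcon-specific tooling.

```python
# Hypothetical back-of-envelope VRAM estimator; the ~10% overhead factor
# for activations/KV cache is an assumption, not a figure from the post.
def estimate_vram_gb(params_billion: float,
                     bytes_per_param: float = 2.0,  # fp16/bf16 weights
                     overhead: float = 1.1) -> float:
    weights_gb = params_billion * bytes_per_param   # 1B params * 2 bytes ≈ 2 GB
    return weights_gb * overhead

print(f"Falcon-40B @ fp16: ~{estimate_vram_gb(40):.0f} GB")       # ~88 GB
print(f"Falcon-40B @ int8: ~{estimate_vram_gb(40, 1.0):.0f} GB")  # ~44 GB
```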
What’s the prompt template best practice for prompting the Llama 2 chat models?
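For reference, a minimal sketch of the chat format Meta documents for the Llama 2 chat models, with the system prompt wrapped in `<<SYS>>` tags inside the first `[INST]` block. The helper name here is illustrative, not an official API.

```python
# Minimal sketch of the Llama 2 chat prompt format; llama2_chat_prompt is
# an illustrative helper name, not part of any library.
def llama2_chat_prompt(system_prompt: str, user_message: str) -> str:
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

print(llama2_chat_prompt(
    "You are a helpful assistant.",
    "How much VRAM do I need to run the 70B chat model?",
))
```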
Ways LLMs could be used in existing products to add a tiny bit of convenience # As an aside, I’m considering writing a piece on “tiny ways LLMs could be used that customers would appreciate” - some examples are:
These are a few boring small needs that businesses have.
Firstly, which AI tools are worth running?
Availability # Lambda Labs
- At least 1x H100 GPU, instant access (actual max unclear)
- Max H100s avail: 60,000 with a 3-year contract (min 1 GPU)
- Pre-approval requirements: Unknown, didn’t do the pre-approval.
Availability # FluidStack
- 1 instance, max up to 25 GPUs on our account, instant access
- Max A100s avail: 2,500 GPUs (min 1 GPU)
- Pre-approval requirements: fill out a web form
- Pricing: $1.
Overall, if you’re not stuck with your existing cloud, I’d recommend FluidStack, Runpod, and Lambda Labs for GPUs.
This post is an exploration of the supply and demand of GPUs, particularly Nvidia H100s.