From:
Tom's Hardware
(Uncensored)
Alibaba Cloud says it cut Nvidia AI GPU use by 82% with new pooling system: up to 9x increase in output lets 213 GPUs perform like 1,192 | Tom's Hardware
https://www.tomshardware.com/tech-industry/semiconductors/alibaba-says-new-pooling-system-cut-nvidia-gpu-use-by-82-percent
A paper presented at SOSP 2025 details how token-level scheduling helped one GPU serve multiple LLMs, reducing demand from 1,192 to 213 H20s.
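The headline 82% figure can be checked against the two GPU counts quoted above. The short sketch below does only that arithmetic; the function name `gpu_reduction` is a hypothetical helper introduced for illustration, and nothing here models the pooling system or the scheduler described in the paper.

```python
def gpu_reduction(before: int, after: int) -> float:
    """Fraction of GPUs no longer needed when demand drops from `before` to `after`."""
    return (before - after) / before


# H20 counts as quoted in the summary above.
before, after = 1192, 213

reduction = gpu_reduction(before, after)  # share of GPUs saved
ratio = before / after                    # how many old GPUs each remaining GPU replaces

print(f"GPU reduction: {reduction:.1%}")
print(f"Consolidation ratio: {ratio:.2f}x")
```

The reduction rounds to the 82% in the headline. The consolidation ratio it implies is about 5.6x, so the "up to 9x increase in output" in the title is presumably a separate, peak measurement and not something derived from these two counts.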