Measuring state-of-the-art GPU performance on Modular's MAX 24.6 compared to vLLM| www.modular.com
This past weekend, developers from across the AI and systems programming communities came together for Modular Hack Weekend: a global, virtual hackathon focused on GPU programming and model implementation with Mojo and MAX. Held primarily online through our Discord server and forum, this event brought fresh energy, bold ideas, and a powerful reminder of what this community can build in just 48 hours.| www.modular.com
We're excited to announce Modular Platform 25.4, a major release that brings the full power of AMD GPUs to our entire platform. This release marks a major leap toward democratizing access to high-performance AI by enabling seamless portability to AMD GPUs.| www.modular.com
A Kubernetes-native cloud service that allows you to deploy, scale, and manage your GenAI applications with state-of-the-art performance| www.modular.com
A high-performance inference engine to build, optimize, and deploy AI apps fast. Run open models, scale across GPUs, and tap into CPU+GPU performance with Mojo.| www.modular.com
Introducing Mammoth, a distributed AI serving tool built specifically for the realities of enterprise AI deployment.| www.modular.com
Announcing Modular Platform 25.3: our largest open source release, with 450k+ lines of high-performance AI kernels, plus pip install modular.| www.modular.com
New licensing terms for MAX and Mojo that allow unlimited non-commercial usage| www.modular.com
GenAI may be new, but GPUs aren’t! Over the years, many have tried to create portable GPU programming models using C++, from OpenCL to SYCL to OneAPI and beyond. These were the most plausible CUDA alternatives that aimed to democratize AI compute, but you may have never heard of them - because they failed to be relevant for AI.| www.modular.com
Answering the question of whether CUDA is “good” is much trickier than it sounds.| www.modular.com
Today, we're excited to announce the release of MAX 25.1, marking a significant evolution in our approach to delivering cutting-edge AI development tools to our community. This release substantially improves the developer experience for Agentic and LLM workflows, introduces a new nightly release model that includes a new GPU programming interface, and launches MAX Builds - your one-stop destination for GenAI development.| www.modular.com
If we as an ecosystem hope to make progress, we need to understand how the CUDA software empire became so dominant.| www.modular.com
It seems like everyone has started talking about CUDA in the last year: It’s the backbone of deep learning, the reason novel hardware struggles to compete, and the core of NVIDIA’s moat and soaring market cap. With DeepSeek, we got a startling revelation: its breakthrough was made possible by “bypassing” CUDA, going directly to the PTX layer… but what does this actually mean? It feels like everyone wants to break past the lock-in, but we have to understand what we’re up against be...| www.modular.com
Part 1 of an article that explores the future of hardware acceleration for AI beyond CUDA, framed in the context of the release of DeepSeek| www.modular.com
MAX 24.6 release blog featuring MAX GPU| www.modular.com
Platforms like TensorFlow, PyTorch, and CUDA do not focus on modularity - there, we said it! They are sprawling technologies with thousands of evolving interdependent pieces that have grown organically into complicated structures over time. AI software developers must deal with this sprawl while deploying workloads to servers, mobile devices, microcontrollers, and web browsers using multiple hardware platforms and accelerators.| www.modular.com
A deep dive into the complexities of optimizing code for SIMD instruction sets across multiple platforms.| www.modular.com
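To make the SIMD topic above concrete, here is a minimal Mojo sketch of elementwise vector arithmetic. It is illustrative only and not drawn from the linked article: the lane width of 4 is hard-coded for readability, whereas width-portable code would query the target hardware's native vector width.

```mojo
# Minimal SIMD sketch (illustrative; a fixed width of 4 lanes is assumed here,
# whereas portable code would query the hardware's native SIMD width).
fn main():
    # Pack four float32 values into single SIMD vectors.
    var a = SIMD[DType.float32, 4](1.0, 2.0, 3.0, 4.0)
    var b = SIMD[DType.float32, 4](10.0, 20.0, 30.0, 40.0)
    # Elementwise operations map onto vector instructions on the target CPU.
    var fma = a * b + b
    print(fma)  # [20.0, 60.0, 120.0, 200.0]
```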
At Modular, open source is ingrained in our DNA. We firmly believe for Mojo to reach its full potential, it must be open source. We have been progressively open-sourcing more of Mojo and parts of the MAX platform, and today we’re thrilled to announce the release of the core modules from the Mojo standard library under the Apache 2 license!| www.modular.com
The Modular Accelerated Xecution (MAX) platform is an integrated suite of tools for AI compute workloads across CPUs and NVIDIA and AMD GPUs.| www.modular.com
Mojo combines the usability of Python with the performance of C, unlocking unparalleled programmability of AI hardware and extensibility of AI models.| www.modular.com
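As a small illustration of that claim (a sketch, not code from the posts above): a Mojo `fn` uses Python-style syntax while declaring static types, so it compiles to native code without a Python interpreter in the loop.

```mojo
# Illustrative sketch: Python-like syntax with static types and native compilation.
fn fib(n: Int) -> Int:
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

fn main():
    print(fib(10))  # 55
```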