In late August, AMD and TensorWave reached out to collaborate on a presentation for AMD’s Media Tech Day—they asked if we could demo MAX on AMD Instinct™ MI355 on September 16th. There was just one problem: no one at Modular had access to an MI355.| www.modular.com
Modular Raises $250M to scale AI's Unified Compute Layer| Modular Blog
Modular 25.6: Unifying the latest GPUs from NVIDIA, AMD, and Apple| Modular Blog
Modverse #51: Modular x Inworld x Oracle, Modular Meetup Recap and Community Projects| Modular Blog
Matrix Multiplication on Blackwell: Part 3 - The Optimizations Behind 85% of SOTA Performance| Modular Blog
Matrix Multiplication on Blackwell: Part 2 - Using Hardware Features to Optimize Matmul| Modular Blog
Matrix Multiplication on Blackwell: Part 1 - Introduction| Modular Blog
Modverse #50: Modular Platform 25.5, Community Meetups, and Mojo's Debut in the Stack Overflow Developer Survey| Modular Blog
Modular Platform 25.5: Introducing Large Scale Batch Inference| Modular Blog
SF Compute and Modular Partner to Revolutionize AI Inference Economics| Modular Blog
AI Agents for AWS Marketplace| Modular Blog
Modverse #49: Modular Platform 25.4, Modular 🤝 AMD, and Modular Hack Weekend| Modular Blog
How is Modular Democratizing AI Compute? (Democratizing AI Compute, Part 11)| Modular Blog
Modular 25.4: One Container, AMD and NVIDIA GPUs, No Lock-In| Modular Blog
Modular + AMD: Unleashing AI performance on AMD GPUs| Modular Blog
Introducing Mammoth: Enterprise-Scale GenAI Deployments Made Simple| Modular Blog
Modverse #48: Modular Platform 25.3, MAX AI Kernels, and the Modular GPU Kernel Hackathon| Modular Blog
Exploring Metaprogramming in Mojo| Modular Blog
Modular GPU Kernel Hackathon Highlights: Innovation, Community, & Mojo🔥| Modular Blog
Modular’s bet to break out of the Matrix (Democratizing AI Compute, Part 10)| Modular Blog
Why do HW companies struggle to build AI software? (Democratizing AI Compute, Part 9)| Modular Blog
Modverse #47: MAX 25.2 and an evening of GPU programming at Modular HQ| Modular Blog
What about the MLIR compiler infrastructure? (Democratizing AI Compute, Part 8)| Modular Blog
What about Triton and Python eDSLs? (Democratizing AI Compute, Part 7)| Modular Blog
MAX 25.2: Unleash the power of your H200s – without CUDA!| Modular Blog
What about TVM, XLA, and AI compilers? (Democratizing AI Compute, Part 6)| Modular Blog
Modverse #46: MAX 25.1, MAX Builds, and Democratizing AI Compute| Modular Blog
Paged Attention & Prefix Caching Now Available in MAX Serve| Modular Blog
Agentic Building Blocks: Creating AI Agents with MAX Serve and OpenAI Function Calling| Modular Blog
Use MAX with Open WebUI for RAG and Web Search| Modular Blog
Hands-on with Mojo 24.6| Modular Blog
Evaluating Llama Guard with MAX 24.6 and Hugging Face| Modular Blog
Build a Continuous Chat Interface with Llama 3 and MAX Serve| Modular Blog
Community Spotlight: Writing Mojo with Cursor| Modular Blog
Hands-on with Mojo 24.5| Modular Blog
MAX 24.5 - With SOTA CPU Performance for Llama 3.1| Modular Blog
Announcing stack-pr: an open source tool for managing stacked PRs on GitHub| Modular Blog
Debugging in Mojo🔥| Modular Blog
Develop locally, deploy globally| Modular Blog
Take control of your AI| Modular Blog
Bring your own PyTorch model| Modular Blog
A brief guide to the Mojo n-body example| Modular Blog
What's new in MAX 24.4? MAX on macOS, fast local Llama 3, native quantization and GGUF support| Modular Blog
What’s new in Mojo 24.4? Improved collections, new traits, os module features and core language enhancements| Modular Blog
MAX 24.4 - Introducing quantization APIs and MAX on macOS| Modular Blog
Deep dive into ownership in Mojo| Modular Blog
What ownership is really about: a mental model approach| Modular Blog
Fast⚡k-means clustering in Mojo🔥: a guide to porting Python to Mojo🔥 for accelerated k-means clustering| Modular Blog
Developer Voices: Deep Dive with Chris Lattner on Mojo| Modular Blog
What’s New in Mojo 24.3: Community Contributions, Pythonic Collections and Core Language Enhancements| Modular Blog
MAX 24.3 - Introducing MAX Engine Extensibility| Modular Blog
Row-major vs. Column-major Matrices: A Performance Analysis in Mojo and NumPy| Modular Blog
What’s new in Mojo 24.2: Mojo Nightly, Enhanced Python Interop, OSS stdlib and more| Modular Blog
MAX 24.2 is Here! What’s New?| Modular Blog
Semantic Search with MAX Engine| Modular Blog
How to Be Confident in Your Performance Benchmarking| Modular Blog
Mojo🔥 ❤️ Pi 🥧: Approximating Pi with Mojo🔥 using Monte Carlo methods| Modular Blog
Evaluating MAX Engine inference accuracy on the ImageNet dataset| Modular Blog
Announcing MAX Developer Edition Preview| Modular Blog
Getting started with MAX Developer Edition| Modular Blog
MAX is here! What does that mean for Mojo🔥?| Modular Blog
What are dunder methods? A guide in Mojo🔥| Modular Blog
Mojo🔥 ♥️ Python: Calculating and plotting a Valentine’s day ♥️ using Mojo and Python| Modular Blog
Mojo vs. Rust: what are the differences?| Modular Blog
What is loop unrolling? How you can speed up Mojo🔥 code with @unroll| Modular Blog
Mojo🔥 SDK v0.7 now available for download!| Modular Blog
Mojo 🔥 lightning talk ⚡️ one language for all AI programming!| Modular Blog
Modular to bring NVIDIA Accelerated Computing to the MAX Platform| Modular Blog
Modular partners with Amazon Web Services (AWS) to bring MAX to AWS services| Modular Blog
Key announcements from ModCon 2023| Modular Blog
Mojo 🔥 Traits Have Arrived!| Modular Blog
Mojo 🔥 Advent of Code 2023| Modular Blog
ModCon 2023 sessions you don’t want to miss!| Modular Blog
ModCon Mojo 🔥 Contest| Modular Blog
What’s new in Mojo SDK v0.5?| Modular Blog
Welcome Mostafa Hagog to Modular| Modular Blog
Mojo🔥 is now available on Mac| Modular Blog
Mojo 🔥 - A systems programming language presented at LLVM 2023| Modular Blog
Community Spotlight: How I built llama2.🔥 by Aydyn Tairov| Modular Blog
Using Mojo🔥 with Python🐍| Modular Blog
How to setup a Mojo🔥 development environment with Docker containers| Modular Blog
AI Regulation: step with care, and great tact| Modular Blog
Mojo🔥 - It’s finally here!| Modular Blog
We’ve raised $100M to fix AI infrastructure for the world's developers| Modular Blog
An easy introduction to Mojo🔥 for Python programmers| Modular Blog
What’s the difference between the AI Engine and Mojo?| Modular Blog
Today’s AI infrastructure is difficult to evaluate, so many teams converge on simple, quantifiable metrics like QPS, latency, and throughput. This is one reason why today’s AI industry is rife with bespoke tools that deliver high performance on benchmarks but pose significant usability challenges in real-world AI deployment scenarios.| www.modular.com
A high-performance inference engine to build, optimize, and deploy AI apps fast. Run open models, scale across GPUs, and tap into CPU+GPU performance with Mojo.| www.modular.com
In this blog post, we’ll continue our journey to build a state-of-the-art (SOTA) matmul kernel on NVIDIA Blackwell by exploring the cluster launch control (CLC) optimization. By the end of the post, we’ll improve our performance by another 15% and achieve 1772 TFLOPS, exceeding the current SOTA.| www.modular.com
Measuring state-of-the-art GPU performance on Modular's MAX 24.6 compared to vLLM| www.modular.com
This past weekend, developers from across the AI and systems programming communities came together for Modular Hack Weekend: a global, virtual hackathon focused on GPU programming and model implementation with Mojo and MAX. Held primarily online through our Discord server and forum, this event brought fresh energy, bold ideas, and a powerful reminder of what this community can build in just 48 hours.| www.modular.com
Announcing Modular Platform 25.3: our largest open source release, with 450k+ lines of high-performance AI kernels, plus pip install modular.| www.modular.com
New licensing terms for MAX and Mojo that allow unlimited non-commercial usage| www.modular.com
GenAI may be new, but GPUs aren’t! Over the years, many have tried to create portable GPU programming models using C++, from OpenCL to SYCL to OneAPI and beyond. These were the most plausible CUDA alternatives that aimed to democratize AI compute, but you may never have heard of them, because they failed to become relevant for AI.| www.modular.com
Answering the question of whether CUDA is “good” is much trickier than it sounds.| www.modular.com
Today, we're excited to announce the release of MAX 25.1, marking a significant evolution in our approach to delivering cutting-edge AI development tools to our community. This release substantially improves the developer experience for Agentic and LLM workflows, introduces a new nightly release model that includes a new GPU programming interface, and launches MAX Builds - your one-stop destination for GenAI development.| www.modular.com
If we as an ecosystem hope to make progress, we need to understand how the CUDA software empire became so dominant.| www.modular.com
It seems like everyone has started talking about CUDA in the last year: It’s the backbone of deep learning, the reason novel hardware struggles to compete, and the core of NVIDIA’s moat and soaring market cap. With DeepSeek, we got a startling revelation: its breakthrough was made possible by “bypassing” CUDA, going directly to the PTX layer… but what does this actually mean? It feels like everyone wants to break past the lock-in, but we have to understand what we’re up against be...| www.modular.com
Part 1 of an article that explores the future of hardware acceleration for AI beyond CUDA, framed in the context of the release of DeepSeek| www.modular.com
MAX 24.6 release blog featuring MAX GPU| www.modular.com