We're excited to introduce Unsloth Flex Attention support for OpenAI gpt-oss training, enabling >8× longer context lengths, >50% less VRAM usage, and >1.5× faster training (with no accuracy degradation) versus all other implementations, including those using Flash Attention 3 (FA3). Unsloth Flex Attention makes it possible to train with a 60K context length on an 80GB H100 GPU with BF16 LoRA.| docs.unsloth.ai
Run & fine-tune OpenAI's new open-source models!
We're excited to introduce more efficient reinforcement learning (RL) in Unsloth with multiple algorithmic advancements.
Learn all about Reinforcement Learning (RL) and how to train your own DeepSeek-R1 reasoning model with Unsloth using GRPO. A complete guide from beginner to advanced.
You can now train OpenAI gpt-oss with RL and GRPO via Unsloth. Unsloth now offers the fastest inference (3× faster), lowest VRAM usage (50% less), and longest context (8× longer) for gpt-oss RL versus any other implementation, with no accuracy loss.
Train your own model with Unsloth, an open-source framework for LLM fine-tuning and reinforcement learning.
Learn the fundamentals and customization options of chat templates, including Conversational, ChatML, ShareGPT, Alpaca formats, and more!
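As a concrete illustration of one of the formats mentioned above, here is a minimal, hand-rolled ChatML renderer. This is a sketch for clarity only: the `to_chatml` function is hypothetical, and in practice you would rely on a tokenizer's built-in chat template rather than formatting strings by hand. The `<|im_start|>`/`<|im_end|>` markers are the standard ChatML special tokens; other formats (ShareGPT, Alpaca) use different delimiters.

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML-formatted string.

    Each turn is wrapped as:
        <|im_start|>{role}
        {content}<|im_end|>
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    return "\n".join(parts)


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(to_chatml(conversation))
```

Running this prints the two turns wrapped in ChatML markers; a generation prompt would typically append a final `<|im_start|>assistant` line before sampling.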