According to Google's developer blog, Google DeepMind has released DiffusionGemma, a nominally 26-billion-parameter open-weight model that generates text using diffusion, the same technique behind AI image generation, rather than the token-by-token (autoregressive) approach used by most large language models. The model uses a mixture-of-experts architecture with 25.2 billion total parameters, of which roughly 3.8 billion are active, and generates 256-token blocks in parallel, making it up to four times faster than comparable Gemma 4 models. It is available under an Apache 2.0 license. The trade-off is quality: on published benchmarks, DiffusionGemma scores below standard Gemma 4. Google positions it for developers exploring speed-critical and interactive local workflows. Nvidia has also announced optimizations for running DiffusionGemma locally on RTX hardware.
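To make the reported figures concrete, here is a minimal back-of-the-envelope sketch in Python. The parameter counts and block size come from the article; the function names and the `denoise_steps_per_block` value are hypothetical, since the article does not say how many refinement passes each block takes. It illustrates the arithmetic only and does not describe Google's implementation.

```python
import math

# Figures reported in the article.
TOTAL_PARAMS_B = 25.2    # total parameters, in billions
ACTIVE_PARAMS_B = 3.8    # roughly active parameters, in billions
BLOCK_TOKENS = 256       # tokens generated in parallel per block


def active_fraction(total_b: float = TOTAL_PARAMS_B,
                    active_b: float = ACTIVE_PARAMS_B) -> float:
    """Share of the mixture-of-experts parameters that are active."""
    return active_b / total_b


def blocks_needed(n_tokens: int, block_tokens: int = BLOCK_TOKENS) -> int:
    """Number of parallel blocks required to cover n_tokens."""
    return math.ceil(n_tokens / block_tokens)


def autoregressive_passes(n_tokens: int) -> int:
    """Token-by-token decoding needs one sequential pass per token."""
    return n_tokens


def diffusion_passes(n_tokens: int, denoise_steps_per_block: int) -> int:
    """Sequential passes for block diffusion.

    ASSUMPTION: denoise_steps_per_block is a hypothetical knob; the
    article does not report how many refinement steps a block uses.
    """
    return blocks_needed(n_tokens) * denoise_steps_per_block


if __name__ == "__main__":
    print(f"active share: {active_fraction():.1%}")  # about 15%
    print("blocks for 1000 tokens:", blocks_needed(1000))
```

Under these assumptions, the number of sequential passes depends on the step count per block rather than on the token count within it, which is the intuition behind the claimed speedup. The actual gain depends on details the article does not give.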
News · AI-assisted
Google Open-Sources 4x Faster Text Model DiffusionGemma
Google's new 26B open model generates text up to 4x faster by using diffusion instead of autoregressive decoding.