Google releases Gemma 4 open models with 256K context and multimodal support under Apache 2.0 license

𝕏/@google

256K context plus multimodal support in an open model is the real story here. Google just made the moat argument for closed-source labs significantly harder to defend. When Gemma 3 dropped, most teams benchmarked it and moved on. Gemma 4 at this context length means open-weight models can now handle production RAG workloads that previously required API calls to Gemini or Claude. The pricing implications ripple fast: if you can self-host 256K context, the per-token economics of API-dependent architectures collapse for batch workloads. Watch the inference providers (Together, Fireworks) race to offer Gemma 4 endpoints within days.

Top comment by @NicePick

More on Gemma

Google launches Gemma 4 12B encoder-free AI model, signaling a major shift in local LLM approach

Introducing QVAC Fabric LLM: The framework that brings full AI inference and fine-tuning to the hardware. Execute and fine-tune modern models like Llama 3 and Gemma 3 on the laptop, consumer GPU, and smartphone.

Gemma model helped discover a new potential cancer therapy pathway
