Gemma 3 12B
- **Parameters:** 12.0B
- **Max Context:** 128K
- **Architecture:** Dense
- **Released:** Mar 12, 2025
- **Modality:** Text + Vision
About Gemma 3 12B
Gemma 3 12B is Google's strongest mid-size open model. With 12B parameters, vision support, a 128K-token context window, and strong multilingual performance across 140+ languages, it delivers exceptional quality for its size. It uses GeGLU activation (where most models use SwiGLU) and an unusual head_dim=160 design. At Q4_K_M it needs roughly 6.5 GB of VRAM, so it fits on 8 GB GPUs. The Gemini-derived training recipe gives it strong instruction following and safety alignment.
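The GeGLU feed-forward mentioned above gates a linear projection with a GELU-activated one: GeGLU(x) = GELU(xW_gate) ⊙ (xW_up). A minimal sketch in plain Python (the real model applies this per layer with large weight matrices; the tanh GELU approximation and the tiny weight shapes here are illustrative assumptions):

```python
import math

def gelu(x: float) -> float:
    # tanh approximation of GELU (the variant commonly used in Gemma-family models)
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def geglu(x: list[float], w_gate: list[list[float]], w_up: list[list[float]]) -> list[float]:
    # GeGLU: elementwise GELU(x @ W_gate) * (x @ W_up)
    gate = [gelu(sum(xi * wij for xi, wij in zip(x, col))) for col in zip(*w_gate)]
    up = [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*w_up)]
    return [g * u for g, u in zip(gate, up)]
```

Compared with a plain GELU MLP, the multiplicative gate lets the network modulate each hidden unit's contribution, which is the same design idea behind SwiGLU, just with GELU in place of SiLU.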
Technical Specifications
System Requirements
Estimated VRAM at 10% overhead for different quantization methods and context sizes.
| Quantization | Bytes/Weight | Quality | 1K ctx | 32K ctx |
|---|---|---|---|---|
| Q4_K_M | 0.50 | ~97% of FP16 | 6.38 GB (Consumer GPU) | 11.83 GB (Consumer GPU) |
| Q8_0 | 1.00 | ~100% of FP16 | 12.58 GB (Consumer GPU) | 18.03 GB (Consumer GPU) |
| F16 | 2.00 | Reference | 24.99 GB (Datacenter GPU) | 30.44 GB (Datacenter GPU) |
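The figures in the table follow the usual pattern: model weights at the quantization's bytes-per-weight, plus the KV cache for the chosen context, scaled by the overhead factor. A rough sketch of that arithmetic (the layer/head-count defaults are assumptions for illustration, and the results will not exactly reproduce the table above, which likely accounts for Gemma 3's sliding-window attention and other details):

```python
def kv_cache_gb(ctx_len: int, n_layers: int = 48, n_kv_heads: int = 8,
                head_dim: int = 256, bytes_per_elem: int = 2) -> float:
    # K and V caches: two [kv_heads x head_dim] tensors per layer, per token
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem / 1024**3

def estimate_vram_gb(params_b: float, bytes_per_weight: float,
                     ctx_len: int, overhead: float = 0.10) -> float:
    weights_gb = params_b * bytes_per_weight  # model weights
    return (weights_gb + kv_cache_gb(ctx_len)) * (1 + overhead)

# e.g. F16 weights at 1K context with 10% overhead:
# estimate_vram_gb(12.0, 2.0, 1024)
```

The point of the sketch is the shape of the formula: weight memory scales with the quantization's bytes-per-weight, while KV-cache memory scales linearly with context length, which is why the 32K-context column grows by a fixed amount regardless of quantization.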
Find the right GPU for Gemma 3 12B
Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.