Qwen 3 8B
Qwen 3 8B is a major leap over Qwen 2.5 7B with hybrid reasoning capabilities — it can toggle between quick-thinking mode and extended chain-of-thought reasoning. At 8.19B parameters with 128K context and Apache 2.0 license, it is arguably …
8.2B
Parameters
128K
Max Context
Dense
Architecture
Apr 29, 2025
Released
Text
Modality
About Qwen 3 8B
Qwen 3 8B is a major leap over Qwen 2.5 7B with hybrid reasoning capabilities — it can toggle between quick-thinking mode and extended chain-of-thought reasoning. At 8.19B parameters with 128K context and Apache 2.0 license, it is arguably the best all-round 8B-class local model as of mid-2025. The thinking mode significantly improves math and logic performance at the cost of higher token usage. Excellent for both interactive chat and structured problem-solving.
Technical Specifications
System Requirements
Estimated VRAM at 10% overhead for different quantization methods and context sizes.
| Quantization | 1K ctx | 128K ctx |
|---|---|---|
Q4_K_M0.50 B/W ~97% of FP16 | 4.37Consumer GPU | 22.23Consumer GPU |
Q8_01.00 B/W ~100% of FP16 | 8.61Consumer GPU | 26.47Datacenter GPU |
F162.00 B/W Reference | 17.07Consumer GPU | 34.93Datacenter GPU |
Other Qwen Models
View AllFind the right GPU for Qwen 3 8B
Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.