Qwen 2.5 7B
- **Parameters:** 7.6B
- **Max Context:** 128K
- **Architecture:** Dense
- **Released:** Sep 19, 2024
- **Modality:** Text
About Qwen 2.5 7B
Qwen 2.5 7B is Alibaba's workhorse model: competitive with Llama 3.1 8B on most benchmarks, with the advantage of Apache 2.0 licensing. It offers strong multilingual support across 29 languages and is especially capable in Chinese and English. The default context window is 32K, extendable to 128K via YaRN. At ~4 GB of VRAM at Q4_K_M, it fits on any modern GPU. The Qwen 2.5 family has extensive fine-tune and GGUF availability, making it a top recommendation alongside Llama 3.1 8B.
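The 128K figure comes from YaRN rope scaling rather than the default configuration. Per the Qwen 2.5 documentation, extended context is enabled by adding a `rope_scaling` block to the model's `config.json`; a minimal sketch of the relevant fragment (surrounding keys elided):

```json
{
  "rope_scaling": {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

The factor of 4.0 scales the native 32K window to 128K (32768 × 4 = 131072). llama.cpp exposes a similar option via its `--rope-scaling yarn` flag.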
Technical Specifications
System Requirements
Estimated VRAM (GB) with 10% overhead, by quantization method and context length. "Consumer GPU" indicates the estimate fits on a consumer-class card.

| Quantization | Bytes/Weight | Quality vs FP16 | 1K ctx (GB) | 128K ctx (GB) |
|---|---|---|---|---|
| Q4_K_M | 0.50 | ~97% | 3.99 (Consumer GPU) | 10.93 (Consumer GPU) |
| Q8_0 | 1.00 | ~100% | 7.92 (Consumer GPU) | 14.87 (Consumer GPU) |
| F16 | 2.00 | Reference | 15.79 (Consumer GPU) | 22.73 (Consumer GPU) |
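These estimates can be roughly reproduced from first principles: model weights at the given bytes-per-weight, plus an fp16 KV cache, plus the 10% overhead. The sketch below assumes the published Qwen2.5-7B architecture numbers (28 layers, 4 GQA key/value heads, head dimension 128); it is a back-of-the-envelope approximation, not the exact formula behind the table.

```python
# Rough VRAM estimate for Qwen 2.5 7B: weights + fp16 KV cache + 10% overhead.
# Architecture constants below are assumptions taken from the published config;
# adjust them if your variant differs.

PARAMS = 7.6e9      # total parameters
N_LAYERS = 28       # transformer blocks
N_KV_HEADS = 4      # GQA key/value heads
HEAD_DIM = 128      # per-head dimension
GIB = 1024 ** 3

def estimate_vram_gib(bytes_per_weight: float, ctx_tokens: int) -> float:
    """Weights + KV cache (K and V in fp16 per layer), with 10% overhead."""
    weight_bytes = PARAMS * bytes_per_weight
    kv_bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * 2  # K+V, 2 bytes each
    kv_bytes = kv_bytes_per_token * ctx_tokens
    return (weight_bytes + kv_bytes) * 1.10 / GIB

if __name__ == "__main__":
    for name, bpw in [("Q4_K_M", 0.50), ("Q8_0", 1.00), ("F16", 2.00)]:
        print(f"{name}: {estimate_vram_gib(bpw, 1024):.2f} GiB @ 1K ctx, "
              f"{estimate_vram_gib(bpw, 131072):.2f} GiB @ 128K ctx")
```

At short contexts this lands close to the table above; at 128K the numbers diverge somewhat, since real runtimes may quantize or otherwise compress the KV cache.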
Find the right GPU for Qwen 2.5 7B
Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.