Mistral · Dense · Apache 2.0

Ministral 3 8B


Parameters: 8.0B
Max Context: 256K
Architecture: Dense
Released: Sep 16, 2025
Modality: Text + Vision

About Ministral 3 8B

Ministral 3 8B is a cascade-distilled compact model derived from Mistral Small 3.1. It inherits much of the 24B teacher's quality at roughly one-third the size. At ~4 GB VRAM at Q4_K_M, it delivers near-13B-class performance in the 8B weight class. Apache 2.0 licensed with vision support and 262K context. An excellent choice for laptop and mid-range GPU deployment.

General Purpose · Code · Vision · Laptop · Commercial

Technical Specifications

Total Parameters: 8.0B
Architecture: Dense
Attention Type: GQA (Grouped Query Attention)
Hidden Dimension: d = 4,096
Transformer Layers: 34
Attention Heads: 32
KV Heads: n_kv = 8
Head Dimension: d_head = 128
Activation Function: SwiGLU
Normalization: RMSNorm
Position Embedding: RoPE
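The GQA figures above (8 KV heads versus 32 attention heads) directly determine KV-cache size. The sketch below, assuming an fp16 cache and using only the numbers from the spec list (the function name is illustrative, not part of any Mistral API), shows the per-token saving versus a hypothetical full multi-head attention cache:

```python
# Sketch: KV-cache bytes per token, from the spec list above.
# Assumes fp16 (2-byte) cache entries; kv_bytes_per_token is an
# illustrative helper, not an official API.

def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_elem=2):
    # K and V each store kv_heads * head_dim values per layer
    return 2 * layers * kv_heads * head_dim * bytes_per_elem

gqa = kv_bytes_per_token(layers=34, kv_heads=8, head_dim=128)   # GQA: 8 KV heads
mha = kv_bytes_per_token(layers=34, kv_heads=32, head_dim=128)  # hypothetical full MHA

print(gqa, mha, mha / gqa)  # GQA cuts the cache by the heads/kv_heads ratio
```

With 8 KV heads instead of 32, the cache is 4x smaller per token, which is what makes the 256K context tractable on a single GPU.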

System Requirements

Estimated VRAM at 10% overhead for different quantization methods and context sizes.

Quantization                          1K ctx      195K ctx    256K ctx
Q4_K_M (0.50 B/W, ~97% of FP16)       4.27 GB*    30.08 GB†   38.14 GB†
Q8_0   (1.00 B/W, ~100% of FP16)      8.40 GB*    34.21 GB†   42.27 GB†
F16    (2.00 B/W, reference)          16.67 GB*   42.48 GB†   50.54 GB†

* fits a 24 GB consumer GPU
† fits an 80 GB datacenter GPU
Larger configurations require a cluster / multi-GPU setup.
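Estimates of this kind typically combine weight memory (parameters times bytes per weight), an fp16 KV cache sized from the GQA specs above, and an overhead factor. The sketch below uses that generic formula with this page's numbers; the exact calculator behind the table is not published here, so its output will not reproduce the table cell-for-cell:

```python
# Sketch of a VRAM estimate: weights + fp16 KV cache, times 10% overhead.
# Constants come from the spec list above; the formula itself is a generic
# approximation, not the site's exact calculator.

PARAMS = 8.0e9                        # total parameters
LAYERS, KV_HEADS, HEAD_DIM = 34, 8, 128

def est_vram_gb(bytes_per_weight, ctx_tokens, kv_bytes=2, overhead=1.10):
    weights = PARAMS * bytes_per_weight
    # K and V each store KV_HEADS * HEAD_DIM values per layer, per token
    kv_cache = 2 * LAYERS * KV_HEADS * HEAD_DIM * kv_bytes * ctx_tokens
    return (weights + kv_cache) * overhead / 1e9

print(f"Q4_K_M, 1K ctx:  {est_vram_gb(0.50, 1024):.2f} GB")
print(f"F16, 256K ctx:   {est_vram_gb(2.00, 262144):.2f} GB")
```

The shape of the results matches the table: weight memory dominates at short context, while at 256K the KV cache contributes tens of gigabytes regardless of weight quantization.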


Find the right GPU for Ministral 3 8B

Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.