
StarCoder2 15B

Parameters: 15.0B
Max Context: 16K
Architecture: Dense
Released: February 2024
Modality: Text

About StarCoder2 15B

StarCoder2 15B is a dense transformer language model from the BigCode family, containing 15B parameters across 40 layers. It supports up to 16K tokens of context, with a hidden dimension of 6,144 and 8 KV heads for efficient grouped-query attention (GQA). The model is distributed under the BigCode OpenRAIL-M license, which permits broad use subject to responsible-use clauses, and delivers strong code-completion performance.
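
As a quick orientation for how the checkpoint is typically consumed, here is a minimal sketch of code completion with Hugging Face Transformers. The checkpoint ID bigcode/starcoder2-15b is the public BigCode release; the dtype and device settings are illustrative assumptions, not official serving guidance.

```python
# Minimal sketch: code completion with StarCoder2 15B via Hugging Face
# Transformers. Assumes ~33 GB of GPU memory for bf16 weights (see the
# System Requirements table below); quantized runtimes need less.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"  # public BigCode checkpoint

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16,  # 2 bytes/weight, matches the F16 row below
    device_map="auto",           # spread layers across available GPUs
)

prompt = "def fibonacci(n: int) -> int:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```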


Technical Specifications

Total Parameters: 15.0B
Architecture: Dense
Attention Type: GQA (Grouped-Query Attention)
Hidden Dimension: d = 6,144
Transformer Layers: 40
Attention Heads: 48
KV Heads: n_kv = 8
Head Dimension: d_head = 128
Activation Function: SwiGLU
Normalization: RMSNorm
Position Embedding: RoPE
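
These figures are internally consistent: d_head = d_model / n_heads = 6144 / 48 = 128, and the 8 KV heads are what keep the KV cache manageable at 16K context. A back-of-the-envelope sketch, assuming an fp16 KV cache (the table below appears to use a more compact cache assumption):

```python
# Back-of-the-envelope check of the specs above, assuming an fp16 KV cache.
d_model, n_heads, n_kv, n_layers = 6144, 48, 8, 40

d_head = d_model // n_heads
assert d_head == 128  # matches the stated head dimension

# KV cache per token: keys + values, per layer, per KV head, 2 bytes (fp16).
bytes_per_token = 2 * n_layers * n_kv * d_head * 2
print(f"KV cache: {bytes_per_token / 1024:.0f} KiB/token")         # 320 KiB

ctx = 16 * 1024
print(f"At 16K context: {bytes_per_token * ctx / 2**30:.1f} GiB")  # 5.0 GiB

# With full multi-head attention (48 KV heads) the cache would be 6x larger:
mha = 2 * n_layers * n_heads * d_head * 2 * ctx / 2**30
print(f"Same cache without GQA: {mha:.1f} GiB")                    # 30.0 GiB
```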

System Requirements

Estimated VRAM (in GB) with 10% overhead, across quantization methods and context sizes.

Quantization | Bytes/Weight | Quality vs. FP16 | 1K ctx | 16K ctx
Q4_K_M | 0.50 | ~97% | 7.91 GB | 10.25 GB
Q8_0 | 1.00 | ~100% | 15.66 GB | 18.01 GB
F16 | 2.00 | reference | 31.17 GB | 33.51 GB

Q4_K_M and Q8_0 fit a 24 GB consumer GPU at both context sizes; F16 fits an 80 GB datacenter GPU. Anything beyond that requires multi-GPU or a cluster.
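
For reference, here is a rough sketch of how numbers like these can be produced. The exact accounting behind the table is not stated; the sketch below, which applies the 10% overhead to the weights and assumes roughly 1 byte per KV-cache element, lands within about 0.3 GB of each cell.

```python
# Rough VRAM estimate: quantized weights (+10% overhead) plus KV cache.
# The 1-byte-per-element KV cache is an assumption chosen to approximate
# the table above; use kv_bytes=2 for a plain fp16 cache.
PARAMS = 15.0e9
N_LAYERS, N_KV, D_HEAD = 40, 8, 128

def vram_gib(bytes_per_weight, ctx_tokens, kv_bytes=1, overhead=0.10):
    """Weight memory (plus overhead) + KV cache, in GiB."""
    weights = PARAMS * bytes_per_weight * (1 + overhead)
    kv = 2 * N_LAYERS * N_KV * D_HEAD * kv_bytes * ctx_tokens
    return (weights + kv) / 2**30

for name, bpw in [("Q4_K_M", 0.50), ("Q8_0", 1.00), ("F16", 2.00)]:
    print(f"{name}: {vram_gib(bpw, 1024):.2f} GiB @ 1K ctx, "
          f"{vram_gib(bpw, 16 * 1024):.2f} GiB @ 16K ctx")
```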


Find the right GPU for StarCoder2 15B

Use the interactive VRAM Calculator to see exactly how much memory you need at any quantization level, context length, and overhead setting.