LLM VRAM Calculator
Estimate how much GPU memory a local language model needs for weights, KV cache, and runtime overhead.
Total VRAM: 5.15 GB
Model Weights: 3.74 GB
KV Cache: 1.00 GB
Memory Breakdown
Model Weights: 3.74 GB
8.0B params × 0.5 bytes/weight
KV Cache: 1.00 GB
8,192 tokens · 125.00 MB per 1K tokens
System Overhead: 0.37 GB
10% of model weights (cuBLAS + workspace)
Scratchpad: 0.04 GB
~1% of weights (temporary tensors)
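
The breakdown above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the calculator's actual code: the function name `estimate_vram_gib` and its parameters are hypothetical. It assumes the page's "GB" are binary gigabytes (GiB), that "per 1K tokens" means per 1,000 tokens measured in MiB, and that the unrounded parameter count is about 8.03 billion (the page shows only the rounded "8.0B"). Under those assumptions the figures match the displayed values.

```python
GIB = 2**30  # assumption: the page's "GB" are binary gigabytes


def estimate_vram_gib(params, bytes_per_weight, context_tokens,
                      kv_mib_per_1k_tokens,
                      overhead_frac=0.10, scratch_frac=0.01):
    """Return a per-component VRAM estimate in GiB (hypothetical helper)."""
    # Weights: parameter count times storage size per weight.
    weights = params * bytes_per_weight / GIB
    # KV cache: grows linearly with context length.
    # Assumption: the per-1K rate is MiB per 1,000 tokens.
    kv_cache = context_tokens / 1000 * kv_mib_per_1k_tokens / 1024
    # Overhead and scratchpad are fixed fractions of the weights,
    # following the 10% and ~1% figures in the breakdown.
    overhead = overhead_frac * weights
    scratchpad = scratch_frac * weights
    return {
        "weights": weights,
        "kv_cache": kv_cache,
        "overhead": overhead,
        "scratchpad": scratchpad,
        "total": weights + kv_cache + overhead + scratchpad,
    }


# Assumption: ~8.03e9 parameters behind the rounded "8.0B" label.
breakdown = estimate_vram_gib(
    params=8.03e9,
    bytes_per_weight=0.5,      # 4 bits per weight
    context_tokens=8192,
    kv_mib_per_1k_tokens=125.0,
)
for name, gib in breakdown.items():
    print(f"{name:>10}: {gib:.2f} GiB")
```

Because overhead and scratchpad scale with the weights, changing the quantization (bytes per weight) moves three of the four components at once, while context length affects only the KV cache.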
Compatible GPUs
GPUs with enough VRAM for this estimate, sorted from smallest capacity to largest.
AMD · Radeon RX 7600 · 8 GB VRAM · $259.99
NVIDIA · GeForce RTX 4060 · 8 GB VRAM · $299.99
NVIDIA · GeForce RTX 5060 · 8 GB VRAM · $299.99
AMD · Radeon RX 9060 XT · 8 GB VRAM · $349.99
NVIDIA · GeForce RTX 4060 Ti · 8 GB VRAM · $399.99
Sparkle · Sparkle Intel Arc B570 Guardian OC 10GB · 10 GB VRAM · $331.32
NVIDIA · GeForce RTX 5070 · 12 GB VRAM · $549.99
AMD · Radeon RX 7800 XT · 16 GB VRAM · $499.99
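
The compatibility list can be sketched as a filter followed by a sort. This is an illustrative guess at the logic, not the site's implementation: the `compatible_gpus` function and the tuple layout are hypothetical, and the tie-break on price within a VRAM tier is inferred from the order of the listing above.

```python
# (vendor, model, vram_gb, price_usd), copied from the listing above.
GPUS = [
    ("AMD", "Radeon RX 7600", 8, 259.99),
    ("NVIDIA", "GeForce RTX 4060", 8, 299.99),
    ("NVIDIA", "GeForce RTX 5060", 8, 299.99),
    ("AMD", "Radeon RX 9060 XT", 8, 349.99),
    ("NVIDIA", "GeForce RTX 4060 Ti", 8, 399.99),
    ("Sparkle", "Sparkle Intel Arc B570 Guardian OC 10GB", 10, 331.32),
    ("NVIDIA", "GeForce RTX 5070", 12, 549.99),
    ("AMD", "Radeon RX 7800 XT", 16, 499.99),
]


def compatible_gpus(gpus, required_gb):
    """Keep GPUs whose VRAM covers the estimate, smallest capacity first."""
    fits = [gpu for gpu in gpus if gpu[2] >= required_gb]
    # Assumption: ties in capacity are broken by ascending price.
    return sorted(fits, key=lambda gpu: (gpu[2], gpu[3]))


for vendor, model, vram_gb, price in compatible_gpus(GPUS, required_gb=5.15):
    print(f"{vendor} {model}: {vram_gb} GB, ${price:.2f}")
```

Raising `required_gb` above 8 (for example, by choosing a longer context) drops the 8 GB cards and leaves the 10 GB Arc B570 as the first match.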