Mar 5, 2026

Best Used GPU for Local LLMs: Why a Used RTX 3090 Beats a New RTX 4070

VRAM capacity matters more than raw compute for local LLMs. Used flagship GPUs deliver more VRAM per dollar than any new midrange card. Here is the math and the buying guide.

By Andre · GPU · AI · LLMs
1. VRAM per dollar: used vs new

| GPU | VRAM | Used Price | Bandwidth | VRAM/$ | Verdict |
|---|---|---|---|---|---|
| RTX 4090 | 24 GB GDDR6X | ~$1,200 | 1,008 GB/s | 20 MB/$ | Best used overall |
| RTX 3090 | 24 GB GDDR6X | ~$450 | 936 GB/s | 53 MB/$ | Best value |
| RX 7900 XTX | 24 GB GDDR6 | ~$600 | 960 GB/s | 40 MB/$ | Best used AMD |
| RTX 4060 Ti 16 GB | 16 GB GDDR6 | ~$350 | 288 GB/s | 46 MB/$ | Avoid (slow) |

The used RTX 3090 at ~$450 delivers 53 MB of VRAM per dollar - more than 2.5x the RTX 4090's 20 MB/$. For LLMs, where VRAM is the bottleneck, this is the metric that matters most. The 3090 also offers 936 GB/s of memory bandwidth - more than 3x the RTX 4060 Ti's 288 GB/s - enough for comfortable 35B-class inference on llama.cpp and Ollama.
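The VRAM-per-dollar column reduces to one line of arithmetic. A minimal Python sketch that reproduces the table's figures from the article's approximate used prices (using 1 GB = 1000 MB, as the table does):

```python
# Approximate used-market prices quoted in the article.
cards = {
    "RTX 4090":          {"vram_gb": 24, "price_usd": 1200},
    "RTX 3090":          {"vram_gb": 24, "price_usd": 450},
    "RX 7900 XTX":       {"vram_gb": 24, "price_usd": 600},
    "RTX 4060 Ti 16 GB": {"vram_gb": 16, "price_usd": 350},
}

def vram_per_dollar(vram_gb: float, price_usd: float) -> float:
    """VRAM per dollar in MB/$ (1 GB treated as 1000 MB, matching the table)."""
    return vram_gb * 1000 / price_usd

for name, c in cards.items():
    print(f"{name}: {vram_per_dollar(c['vram_gb'], c['price_usd']):.0f} MB/$")
```

Rerun it with current listing prices to see how the ranking shifts as the used market moves.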

2. What 24 GB used buys that 16 GB new cannot

• 24 GB covers: 35B at Q4 (~20 GB), Mixtral 8x7B Q4 (~14 GB), Qwen 32B Q4 (~18 GB)
• 16 GB maxes out at: 13B at Q8 (~13 GB), 34B at Q3 (~14 GB, degraded quality)
• The 8 GB gap = the entire 35B model tier

A new RTX 4070 Ti Super at ~$800 gives you 16 GB and a warranty. A used RTX 3090 at ~$450 gives you 24 GB, full CUDA support, and no warranty. That 50% more VRAM opens up Mixtral 8x7B, Qwen 32B, and Command R 35B at Q4 - models that simply do not fit in 16 GB. For local LLM use, the used card is the better tool.
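As a back-of-the-envelope check on these fit claims, here is a rough Python sketch. The bits-per-weight values and the flat overhead for KV cache and runtime buffers are simplifying assumptions, not exact loader behavior; real usage varies with context length and backend:

```python
def model_vram_gb(params_b: float, bits_per_weight: float,
                  overhead_gb: float = 1.5) -> float:
    """Rough VRAM estimate: quantized weights plus a flat overhead
    (KV cache + runtime buffers). A simplification, not loader-exact."""
    weights_gb = params_b * bits_per_weight / 8
    return weights_gb + overhead_gb

def fits(params_b: float, bits_per_weight: float, card_vram_gb: float) -> bool:
    """Does the estimated footprint fit in the card's VRAM?"""
    return model_vram_gb(params_b, bits_per_weight) <= card_vram_gb

# Q4_K_M averages roughly 4.5-4.8 bits per weight; 4.5 used here.
print(f"35B at Q4: ~{model_vram_gb(35, 4.5):.1f} GB")  # weights ~19.7 GB + 1.5 GB overhead
print("fits 24 GB:", fits(35, 4.5, 24), "| fits 16 GB:", fits(35, 4.5, 16))
```

The same function shows why 16 GB still handles the 13B tier comfortably: 13B at roughly 8.5 bits per weight lands near 15 GB.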

3. Used GPU buying checklist

1. Test VRAM

Run CUDA memtest for 15+ minutes. Bad VRAM shows errors quickly. This is the most important test for LLM use.

2. Check thermals

Run a 30-minute stress test. An RTX 3090/4090 should stay under a 95 °C hotspot temperature; higher readings usually mean dried thermal paste, worn pads, or failing fans.

3. Inspect physically

Look for bent pins, damaged ports, PCB damage. Verify fan spin is smooth with no rattling.

4. Buy with protection

Use eBay, Swappa, or platforms with buyer protection. Avoid cash-only deals unless you can test in person.

5. Benchmark immediately

Run a known LLM benchmark (e.g., llama.cpp prompt eval + generation). Compare speed to expected values for that GPU.
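A minimal sketch of that comparison step in Python. The reference tokens-per-second ranges below are illustrative placeholders, not measured values; substitute numbers from a benchmark source you trust for your exact model and quantization:

```python
# Expected generation speed (tokens/sec) per GPU for one model + quant.
# PLACEHOLDER values for illustration only - replace with trusted benchmarks.
REFERENCE_TPS = {
    "RTX 3090": (30, 45),
    "RTX 4090": (55, 80),
}

def looks_healthy(gpu: str, measured_tps: float, tolerance: float = 0.8) -> bool:
    """Flag the card if it runs below `tolerance` x the low end of the
    expected range - a sign of thermal throttling or a defective unit."""
    low, _high = REFERENCE_TPS[gpu]
    return measured_tps >= tolerance * low

print(looks_healthy("RTX 3090", 38))  # in range
print(looks_healthy("RTX 3090", 15))  # suspiciously slow - investigate
```

A card that benchmarks far below its class within the return window is a card you can still send back.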

4. When used does not make sense

  • You need 32 GB or more. No used consumer card has 32 GB; the RTX 5090 is the only consumer option.
  • You cannot risk downtime. Used cards carry no warranty; if one fails, you buy another.
  • Your PSU is under 650 W. Used flagships (3090, 4090) draw 350-450 W at peak, so factor a PSU upgrade into the cost.
  • You are on Windows with AMD. ROCm support on Windows is immature; use Linux or buy NVIDIA.
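On the PSU point, a common rule of thumb is to sum peak component draws and add roughly 25% headroom for transient spikes. A minimal Python sketch; the headroom factor and base-system figure are assumptions, not a vendor spec:

```python
def psu_recommendation_w(gpu_peak_w: float, cpu_tdp_w: float,
                         base_system_w: float = 100,
                         headroom: float = 1.25) -> int:
    """Rule-of-thumb PSU sizing: peak draws plus ~25% headroom for
    transient spikes. An estimate, not a vendor specification."""
    return round((gpu_peak_w + cpu_tdp_w + base_system_w) * headroom)

# Used RTX 3090 (~350 W peak) paired with a 125 W CPU:
print(psu_recommendation_w(350, 125))  # -> 719, so budget for a 750 W unit
```

If the result exceeds your current PSU's rating, add the replacement unit's price to the used card's effective cost before comparing it against a lower-power new card.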

For the full GPU lineup at every price point, see Best GPU for Local LLMs. Before buying a used card, use our VRAM Calculator to confirm your target model fits in the VRAM of the card you are considering.

Frequently Asked Questions

Is buying a used GPU safe for local LLM workloads?
Yes, with caveats. Mining cards may have worn fans and dried thermal paste, but VRAM chips are robust. Gaming cards are generally in better condition. Run CUDA memtest or a sustained VRAM stress test for 15+ minutes on arrival.
Should I worry about VRAM degradation on used cards?
GDDR6/GDDR6X VRAM is very durable. Degradation is extremely rare. The main risk is fan failure, which is cheap to replace.
How much should I pay for a used RTX 4090?
$1,100-1,400 is the typical range in 2026. Below $1,000 is a great deal. Above $1,500 is too close to RTX 5090 pricing.
