[r/LocalLLaMA] score: 0.20
I have DeepSeek V4 Pro at home
May 10, 2026
A practitioner successfully ran DeepSeek V4 Pro, quantized to Q4_K_M, on a local EPYC workstation with 12×96 GB of RAM (1,152 GB total) and a single RTX PRO 6000 Max-Q GPU, using a community fork of llama.cpp built with CUDA and flash-attention support. This demonstrates that frontier-scale MoE models are now runnable locally when enough CPU RAM is available to hold the weights that don't fit in VRAM.
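As a rough illustration of the partial-offload setup described above, here is a minimal sketch using the llama-cpp-python bindings rather than the community fork the poster used; the GGUF filename, layer count, and context size are hypothetical placeholders, not values from the post.

```python
# Minimal sketch: run a large quantized GGUF model mostly from CPU RAM,
# offloading only a few layers to a single GPU. All concrete values here
# (file path, n_gpu_layers, n_ctx) are assumptions for illustration.
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-v4-pro-Q4_K_M.gguf",  # hypothetical filename
    n_gpu_layers=8,    # offload a handful of layers to the lone GPU;
                       # the rest of the weights stay in system RAM
    n_ctx=4096,        # modest context so the KV cache fits alongside
    flash_attn=True,   # flash-attention kernel, as mentioned in the post
)

out = llm("Explain mixture-of-experts routing in one sentence.",
          max_tokens=64)
print(out["choices"][0]["text"])
```

The design point is that with MoE models only a small fraction of parameters is active per token, so keeping most weights in system RAM and streaming work through one GPU remains practical despite the model's total size.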
other