[r/LocalLLaMA] score: 0.20

I have DeepSeek V4 Pro at home

May 10, 2026
A practitioner ran DeepSeek V4 Pro, quantized to Q4_K_M, on a local EPYC workstation with 12×96 GB (1,152 GB) of system RAM and a single RTX PRO 6000 Max-Q GPU, using a community fork of llama.cpp with CUDA and flash-attention support. This demonstrates that frontier-scale MoE models can now run locally when enough of the weights are offloaded to CPU RAM.
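A back-of-envelope check makes the claim plausible. Assuming Q4_K_M averages roughly 4.8 bits per weight (a commonly cited approximation; the exact figure varies by tensor mix), the workstation's 12×96 GB of RAM bounds the model size that could fit entirely in system memory:

```python
# Rough capacity check for the workstation described in the post.
# bits_per_weight is an assumption: Q4_K_M is often quoted at ~4.8 bpw on average.
ram_gb = 12 * 96            # 12 modules x 96 GB = 1152 GB system RAM
bits_per_weight = 4.8       # approximate Q4_K_M average (assumption)

bytes_per_weight = bits_per_weight / 8
max_params = ram_gb * 1e9 / bytes_per_weight
print(f"{ram_gb} GB RAM -> ~{max_params / 1e12:.1f}T parameters at Q4_K_M")
```

By this estimate, well over a trillion parameters of Q4_K_M weights fit in RAM alone, before counting the GPU's VRAM; actual headroom is smaller once KV cache, activations, and OS overhead are accounted for.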