● builder: You can begin experimenting with local DeepSeek V4 Flash inference today via the PR, but expect instability and slow throughput until GPU support lands.
● researcher: Early results suggest V4 Flash is unusually robust to quantization for its size class, which is worth tracking as the PR matures.