● builder: You can fine-tune MLLMs served via vLLM without modifying the compiled graph, which is directly relevant if LoRA incompatibility with your inference stack has been a blocker.
● researcher: Optimizing the raw visual input as a soft-token mechanism is an architecturally distinct PEFT approach, with implications for understanding how visual tokens influence frozen-model behavior.