VeriBound provides PAC-Bayesian generalization bounds for process reward models (PRMs) trained on labels from formal verification tools (Z3, Isabelle), offering a theoretical explanation for the cross-task generalization observed in FOVER. The framework delivers four results for verification-trained PRMs: generalization bounds, sample complexity estimates, convergence rates, and Best-of-K performance guarantees.
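For readers unfamiliar with the general shape of a PAC-Bayesian bound, the sketch below evaluates the classical McAllester-style bound (in Maurer's form): with probability at least 1 − δ, the true risk is at most the empirical risk plus sqrt((KL(Q‖P) + ln(2√n/δ)) / (2n)). This is the textbook bound, not VeriBound's own result, which is not stated in this summary. The function name and the example numbers are hypothetical, chosen only for illustration.

```python
import math


def mcallester_bound(empirical_risk, kl_divergence, n_samples, delta=0.05):
    """Classical PAC-Bayes upper bound on true risk (McAllester/Maurer form).

    Illustrative only: this is the generic textbook bound, NOT the specific
    bound derived by VeriBound. All argument names are assumptions.

    empirical_risk: average 0-1 loss of the posterior on the training set
    kl_divergence:  KL(Q || P) between posterior Q and prior P, in nats
    n_samples:      number of i.i.d. labeled training examples
    delta:          allowed failure probability of the bound
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    if kl_divergence < 0.0:
        raise ValueError("kl_divergence must be non-negative")

    # Complexity term: (KL + ln(2 * sqrt(n) / delta)) / (2n)
    complexity = (
        kl_divergence + math.log(2.0 * math.sqrt(n_samples) / delta)
    ) / (2.0 * n_samples)

    # A risk under 0-1 loss can never exceed 1, so clip the bound there.
    return min(1.0, empirical_risk + math.sqrt(complexity))


if __name__ == "__main__":
    # Hypothetical numbers: 10% training error, KL of 10 nats.
    for n in (1_000, 10_000, 100_000):
        print(n, round(mcallester_bound(0.10, 10.0, n), 4))
```

The bound tightens roughly as 1/√n, which is the kind of dependence that sample complexity estimates are built on: fixing a target gap and solving for n gives the number of verification-labeled examples needed under these assumptions.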
HOW THIS AFFECTS YOU
● Researcher: Gives theoretical grounding to the empirical generalization behavior of verification-trained PRMs, with bounds on sample complexity and Best-of-K performance you can use to justify or tune PRM training pipelines.