PAC-Bayesian Bounds Formalize Why Verification-Trained PRMs Generalize | HACKOBAR_