Beneficial RL Training on Health Data Improves Cross-Domain Alignment | HACKOBAR_