RL Fine-Tuning Amplifies Misalignment More Than SFT, Even from Benign Rewards | HACKOBAR_