[arXiv]

Spurious Correlation Learning in Preference Optimization: Mechanisms, Consequences, and Mitigation via Tie Training

May 13, 2026

An analysis of spurious correlation learning in preference optimization methods such as DPO. It characterizes the mechanisms behind sycophancy and length bias and presents a provable mitigation strategy based on tie training.

cs.LG, cs.AI

SOURCE

https://arxiv.org/abs/2605.11134
