[arXiv] score: 0.24

Perturbation is All You Need for Extrapolating Language Models

May 7, 2026
An arXiv preprint (2605.04344) proposes replacing standard autoregressive next-token prediction with a perturbation-based training framework: the model conditions on semantically perturbed prefixes rather than exact ones, which creates a hierarchical pre-post-additive noise structure. The authors report out-of-support generalization gains with in-distribution performance preserved, and they back the method with a formal theory of extrapolability. LLM pretraining teams and researchers working on distribution shift should take note: the work challenges the foundational assumption that exact-prefix conditioning is optimal and offers a theoretically grounded alternative to standard cross-entropy training with no architectural overhead.
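To make the idea of conditioning on a perturbed prefix concrete, here is a minimal toy sketch in Python. It is not the paper's method: the synonym table, the function names perturb_prefix and make_training_pairs, and the rate parameter are all hypothetical stand-ins, and the sketch does not model the hierarchical pre-post-additive noise structure. It only shows the data-side change: each training pair keeps the exact next token as its target, while the prefix fed to the model may be altered.

```python
import random

# Hypothetical synonym table standing in for a "semantic" perturbation.
# The summary does not say how the paper perturbs prefixes; this is an assumption.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "happy": ["glad", "joyful"],
    "big": ["large", "huge"],
}


def perturb_prefix(tokens, rate, rng):
    """Copy tokens, swapping each token that has a synonym with probability rate."""
    out = []
    for tok in tokens:
        options = SYNONYMS.get(tok)
        if options and rng.random() < rate:
            out.append(rng.choice(options))
        else:
            out.append(tok)
    return out


def make_training_pairs(tokens, rate, seed=0):
    """Build (perturbed_prefix, exact_next_token) pairs for one sequence."""
    rng = random.Random(seed)
    pairs = []
    for i in range(1, len(tokens)):
        # The target stays exact; only the conditioning prefix is perturbed.
        pairs.append((perturb_prefix(tokens[:i], rate, rng), tokens[i]))
    return pairs


if __name__ == "__main__":
    sentence = ["the", "quick", "dog", "is", "happy"]
    for prefix, target in make_training_pairs(sentence, rate=0.5):
        print(prefix, "->", target)
```

With rate set to 0.0 the pairs reduce to ordinary exact-prefix next-token examples, so the same pipeline covers the standard baseline. A real setup would feed these pairs to the usual cross-entropy loss, which is consistent with the summary's claim that no architectural change is needed.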
stat.ML, cs.LG, math.ST, stat.TH