[arXiv] score: 0.81
Aligned LLMs Predict Norms, Not Human Behavior — 10:1 Gap vs Base Models
May 27, 2026
Across 120 base-aligned model pairs evaluated on 10,000+ real human decisions in strategic games, base models outperform aligned models at predicting actual human choices by nearly 10:1, while aligned models dominate on one-shot textbook games, indicating alignment instills normative rather than descriptive behavioral priors.
cs.CL cs.AI cs.GT
HOW THIS AFFECTS YOU
● researcher: You need to account for this normative bias when using aligned LLMs as proxies for human behavior in simulation or evaluation pipelines.
● policy: Worth watching because alignment objectives may systematically diverge from modeling real human decision-making, with implications for how we interpret and audit model behavior in social contexts.