
28 LLMs Mimic Human Risk Caution Superficially but Diverge on Mechanism in St. Petersburg Tests

June 2, 2026

An evaluation of 28 LLMs on the St. Petersburg paradox and controlled variants finds that most produce finite bids resembling human caution. Their underlying decision mechanisms, however, diverge from human reasoning when prompts perturb truncation, repeated play, or identity framing. Instruction-tuned models do not consistently outperform base models on mechanism alignment.
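For context, here is a minimal sketch of why truncation is a meaningful perturbation. It assumes the textbook form of the game (first heads on flip k pays 2^k, with probability 2^-k) and a hypothetical cap after which the game pays nothing. The paper's exact prompts and payoff convention are not given here, and the function name is illustrative only.

```python
from fractions import Fraction


def truncated_expected_value(max_flips: int) -> Fraction:
    """Expected payoff of a St. Petersburg game capped at max_flips flips.

    Assumed convention (not taken from the paper): the first heads on
    flip k pays 2**k and occurs with probability 2**-k; if no heads
    appears within max_flips flips, the payoff is 0.
    """
    # Each term is (1 / 2**k) * 2**k == 1, so the sum equals max_flips.
    return sum(
        (Fraction(1, 2**k) * 2**k for k in range(1, max_flips + 1)),
        Fraction(0),
    )


if __name__ == "__main__":
    for cap in (1, 10, 100):
        print(cap, truncated_expected_value(cap))
```

Under these assumptions the expected value grows linearly with the cap and is unbounded without one, so a bidder who actually reasons from expected value should change a bid when the cap changes, whereas a fixed "cautious" number would not move.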

paper

HOW THIS AFFECTS YOU

● Researcher: The St. Petersburg testbed isolates outcome-level mimicry from mechanism-level alignment, providing a reusable probe for distinguishing surface-level behavioral agreement from genuine decision-process similarity.

● Policy: This finding complicates using behavioral benchmarks alone to assess LLM alignment with human values in high-stakes decision contexts.

SOURCE

https://huggingface.co/papers/2606.04978
