[arXiv] score: 0.09

LLM Value Annotation Varies by Model; Prompt Calibration Reduces Misattribution

June 10, 2026

Different LLMs annotate human values in social media text in systematically different ways when labeling under Schwartz's theory of basic human values. Iterative prompt calibration, driven by error analysis, improves structural alignment and reduces spurious value attributions. The evaluation goes beyond F1 to cover the relation between confidence and ambiguity, as well as annotation stability across non-English posts.

HOW THIS AFFECTS YOU

Researcher: If you use LLMs for subjective construct annotation, model choice and prompt calibration materially affect structural validity, not just surface accuracy metrics.

