[arXiv] score: 0.52
A-R Space Framework Profiles LLM Agent Execution vs. Refusal Behavior
May 26, 2026
A two-dimensional Action Rate / Refusal Signal framework measures how tool-using LLM agents redistribute execution and refusal behaviors across four normative regimes and three autonomy scaffolds, revealing structural gaps between linguistic safety signals and actual execution.
cs.AI, cs.SE
HOW THIS AFFECTS YOU
● researcher: You can use the A-R/Divergence metrics as a more granular alternative to aggregate safety scores when evaluating agentic LLMs under varying autonomy configurations.
● policy: Worth watching because it surfaces how refusal signals and actual execution diverge under malicious or gray-area prompting regimes, which has direct implications for deployment governance of tool-using agents.