[arXiv] score: 0.13
A Definition of Good Explanations and the Challenges Explaining LLM Outputs
June 16, 2026
Good explanations, defined here as counterfactual statements weighted by the interlocutor's prior beliefs, pose a structural problem for LLM explainability: because an LLM's output depends on billions of parameters with no discrete causal chain, satisfying the counterfactual and belief-updating criteria at the same time is intractable. The paper argues that this is not a tooling gap but a fundamental mismatch between how LLMs generate outputs and what explanations require.