[arXiv] score: 0.85

Trajel Benchmark Finds Most Agent Hallucinations Occur in Intermediate Steps, Not Final Outputs

May 26, 2026

Trajel introduces a five-type trajectory hallucination taxonomy over expert-annotated multi-agent industrial traces, finding that nearly half of hallucinated trajectories involve multiple types simultaneously and that high-accuracy binary detectors still misclassify subtle failure modes.

cs.AI

HOW THIS AFFECTS YOU

● builder: You should audit intermediate Thought-Action-Observation steps in your agent pipelines, not just final outputs, since current detectors miss the most common failure modes.

● researcher: The five-type taxonomy (factual, referential, logical, procedural, scope-based) and subtask-level annotation provide a more granular evaluation framework than existing final-output hallucination benchmarks.

● policy: The finding that high binary accuracy masks misclassification of subtle hallucination types has direct implications for AI auditing standards in industrial agentic deployments.
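
To make the gap between binary and type-level scoring concrete, here is a minimal sketch. It is not the paper's code or data format: the `Step` record, both metric functions, and the four-step toy trace are hypothetical and invented purely for illustration. Only the five type names come from the summary above.

```python
from dataclasses import dataclass
from enum import Enum


class HallucinationType(Enum):
    # The five types named in the summary; the string values are our own choice.
    FACTUAL = "factual"
    REFERENTIAL = "referential"
    LOGICAL = "logical"
    PROCEDURAL = "procedural"
    SCOPE_BASED = "scope-based"


@dataclass(frozen=True)
class Step:
    """One Thought-Action-Observation step (hypothetical record layout)."""
    thought: str
    action: str
    observation: str
    gold: frozenset = frozenset()       # annotated types; empty means clean
    predicted: frozenset = frozenset()  # detector output; empty means clean


def binary_accuracy(steps):
    """Fraction of steps where 'hallucinated or not' is judged correctly."""
    return sum(bool(s.gold) == bool(s.predicted) for s in steps) / len(steps)


def type_exact_match(steps):
    """Fraction of steps where the full set of types is judged correctly."""
    return sum(s.gold == s.predicted for s in steps) / len(steps)


T = HallucinationType
# Toy trace, invented for illustration only (not taken from the benchmark).
trace = [
    Step("plan query", "search(db)", "3 rows"),
    Step("infer cause", "diagnose()", "pump fault",
         gold=frozenset({T.LOGICAL}), predicted=frozenset({T.FACTUAL})),
    Step("fix it", "restart(all)", "ok",
         gold=frozenset({T.PROCEDURAL, T.SCOPE_BASED}),
         predicted=frozenset({T.PROCEDURAL})),
    Step("report", "summarize()", "done"),
]

print(binary_accuracy(trace))   # 1.0
print(type_exact_match(trace))  # 0.5
```

In this toy trace the detector flags every hallucinated step, so binary accuracy is perfect, yet it names the wrong type once and misses a co-occurring type once. Scoring each intermediate step on its full label set is one way to surface that kind of error.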

SOURCE

https://arxiv.org/abs/2605.24219
