[arXiv]score: 0.11

GraphARC Benchmark Exposes Comprehension-Execution Gap in LLMs on Graph Transformations

June 1, 2026

GraphARC extends the ARC few-shot transformation paradigm to graph-structured data, covering local, global, and hierarchical transformations across diverse graph families. State-of-the-art LLMs can answer property questions about graphs but consistently fail to execute full transformation tasks, and performance degrades with graph size.

cs.AI

HOW THIS AFFECTS YOU

●

researcherGraphARC isolates a specific failure mode — models understanding graph structure but failing to apply transformation rules — that is worth targeting if you work on relational or structured reasoning.

SOURCE

https://arxiv.org/abs/2605.31031

← back to feed