[arXiv] score: 0.24

Hierarchical Visual Agent: Managing Contexts in Joint Image-Text Space for Advanced Chart Reasoning

May 7, 2026
Researchers introduce HierVA, a hierarchical visual agent framework that decomposes chart reasoning into manager and worker roles with separate visual and textual contexts. A manager generates reasoning plans, while specialized workers perform multi-step inference on subplots. Using scoped visual context and iterative context updates, it shows consistent improvements over baseline MLLMs on the CharXiv reasoning subset.
cs.CV, cs.CL
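
The manager-worker split described in the summary can be pictured with a small toy sketch. This is not the authors' implementation: the names `Step`, `Context`, `crop`, `toy_manager`, `toy_worker`, and `run_plan` are hypothetical, the "image" is a nested list of numbers standing in for chart pixels, and the manager and worker are plain functions rather than MLLM calls. It only illustrates the idea of giving each worker a scoped crop and folding its finding back into a shared textual context.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), hypothetical layout
Image = List[List[int]]          # toy stand-in for chart pixels


@dataclass
class Step:
    subplot: str   # label of the subplot this step looks at
    box: Box       # region of the chart the worker is allowed to see
    question: str  # sub-question for the worker


@dataclass
class Context:
    notes: List[str] = field(default_factory=list)  # textual context, grows per step


def crop(image: Image, box: Box) -> Image:
    """Return only the region inside box (the scoped visual context)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def toy_manager(question: str) -> List[Step]:
    """Assumed planner: returns a fixed two-step plan instead of calling a model."""
    return [
        Step("left", (0, 0, 2, 2), "largest value?"),
        Step("right", (2, 0, 4, 2), "largest value?"),
    ]


def toy_worker(view: Image, question: str, notes: List[str]) -> str:
    """Assumed worker: inspects only its crop; a real worker would be a model call."""
    return f"max={max(max(row) for row in view)}"


def run_plan(image: Image, steps: List[Step],
             worker: Callable[[Image, str, List[str]], str]) -> Context:
    ctx = Context()
    for step in steps:
        view = crop(image, step.box)                          # scope the visual context
        finding = worker(view, step.question, list(ctx.notes))
        ctx.notes.append(f"{step.subplot}: {finding}")        # iterative context update
    return ctx


image = [[1, 2, 9, 4],
         [3, 5, 6, 7]]
result = run_plan(image, toy_manager("Which subplot peaks higher?"), toy_worker)
print(result.notes)
```

Each worker receives a copy of the notes gathered so far, so later steps can build on earlier findings without seeing the full chart.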