[arXiv]score: 0.19
PatchBoard Multi-Agent System Hits 84.6% on ALFWorld vs 30.8% for LangGraph
May 29, 2026
PatchBoard replaces inter-agent natural-language dialogue with validated JSON Patch mutations over a shared schema-grounded state, enforced by a deterministic kernel checking role contracts and runtime invariants before each commit. On 630 ALFWorld episodes it achieves 84.6% success at 45.5k tokens per task versus LangGraph's 30.8% at 368.3k tokens.
cs.CL
HOW THIS AFFECTS YOU
●
builderThe 2.7x success rate improvement and 8x token reduction over LangGraph on ALFWorld make PatchBoard's schema-grounded state mutation approach worth evaluating for production multi-agent pipelines.
●
researcherThe transactional kernel with role-specific write contracts is a concrete architecture for studying coordination reliability in multi-agent systems.