WorldLines Benchmark Tests Embodied Agents on Long-Horizon Household Memory
June 18, 2026
WorldLines constructs temporally extended household traces with object state changes, dialogues, and execution feedback to evaluate long-horizon embodied agents on Memory QA and Task Planning. The companion ObsMem framework maintains visibility-aware memories and action-native state trails to address failures caused by partial observability.
HOW THIS AFFECTS YOU
● Researcher: WorldLines fills a gap between language-centric memory benchmarks and short-horizon embodied tasks, with dynamic world state tracking as the core evaluation challenge.