
VitaBench 2.0 Benchmarks LLM Agents on Long-Term Personalization and Proactive Behavior

May 25, 2026

VitaBench 2.0 evaluates agents on temporally ordered, per-user task sequences requiring inference of unstated preferences and proactive interaction, filling a gap left by benchmarks focused solely on reasoning and tool use.


HOW THIS AFFECTS YOU

● Builder: You can use VitaBench 2.0 to measure whether your agent actually learns and acts on user preferences over time, not just within a single session.

● Researcher: The temporally ordered per-user task structure provides a more realistic evaluation axis for personalization and proactivity than existing static agent benchmarks.

SOURCE

https://huggingface.co/papers/2605.27141
