[HUGGINGFACE] score: 0.47
SimuWoB: 120-Task Synthetic Mobile GUI Benchmark Covers Real-App Complexity and Long-Horizon Tasks
May 23, 2026
SimuWoB generates 120 high-fidelity synthetic mobile GUI tasks spanning diverse real-world app types and difficulty levels, addressing the gap between reproducible benchmarks limited to open-source apps and actual production app complexity.
HOW THIS AFFECTS YOU
● Builder: You can evaluate mobile GUI agents on complex, long-horizon tasks that better reflect real app interactions without needing live app access.
● Researcher: The synthetic environment generation framework enables scalable, reward-verified evaluation of GUI agents on tasks previously inaccessible due to closed-source app constraints.