MyPCBench: 184-Task Personal Computer-Use Benchmark with Login-Required Web Apps

June 14, 2026

MyPCBench evaluates computer-use agents as personal assistants on a Linux desktop with 17 simulated web apps seeded with a canonical persona. Its 184 tasks require logged-in accounts and personal context, a setting that live-web benchmarks cannot test.

HOW THIS AFFECTS YOU

● Builder: If you're building desktop or browser agents, this benchmark tests the authenticated, personalized task space your users actually care about.

● Researcher: Closes the evaluation gap between impersonal sandboxes and real personal-assistant deployments by including authenticated, context-dependent web tasks.

Source: huggingface.co
