● builder: If you're building mobile agents, PhoneHarness provides a verifiable evaluation harness that tests real workflow completion across action types, which is more representative than GUI-only benchmarks.
● researcher: The mixed-action MDP with auditable side effects is a cleaner evaluation setup for studying when agents should switch between GUI, CLI, and tool modalities.