● builder: If you're training or evaluating coding agents, CapReward offers a concrete mechanism to reduce reward hacking without requiring manual test curation.
● researcher: Capped evaluation design provides a principled method for detecting deceptive performance in agent benchmarks, applicable beyond coding tasks.