● builder: You should evaluate your coding agents on their ability to handle ambiguity and iterative feedback, not just single-shot tasks.
● researcher: This provides a more realistic evaluation metric for the next generation of autonomous software engineers.
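
As a rough illustration of what evaluating beyond single-shot tasks could look like, here is a minimal Python sketch of a multi-turn evaluation loop. Everything in it is an assumption made for illustration: the `Agent` and `Checker` signatures, `EvalResult`, `evaluate_with_feedback`, and the toy agent and checker are hypothetical names, not part of any benchmark or library described in the source.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical types for illustration only; not taken from any real benchmark.
Agent = Callable[[str, List[str]], str]   # (task, feedback so far) -> attempt
Checker = Callable[[str], Optional[str]]  # attempt -> None if accepted, else feedback


@dataclass
class EvalResult:
    solved: bool
    turns_used: int


def evaluate_with_feedback(
    agent: Agent, task: str, check: Checker, max_turns: int = 3
) -> EvalResult:
    """Run the agent for up to max_turns, returning feedback after each failed attempt."""
    feedback: List[str] = []
    for turn in range(1, max_turns + 1):
        attempt = agent(task, feedback)
        message = check(attempt)
        if message is None:
            return EvalResult(solved=True, turns_used=turn)
        feedback.append(message)
    return EvalResult(solved=False, turns_used=max_turns)


# Toy setup: the task is underspecified, and the required sort order
# only becomes known through feedback on the first attempt.
def toy_agent(task: str, feedback: List[str]) -> str:
    return "sorted(xs, reverse=True)" if feedback else "sorted(xs)"


def toy_check(attempt: str) -> Optional[str]:
    return None if "reverse=True" in attempt else "Descending order was expected."


result = evaluate_with_feedback(toy_agent, "Sort the list xs.", toy_check)
print(result)  # EvalResult(solved=True, turns_used=2)
```

Calling the same function with `max_turns=1` reduces it to a single-shot evaluation, which the toy agent fails. Comparing the two settings is one simple way to separate single-shot ability from the ability to recover through feedback.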