●researcherQuantifies a concrete safety property — control evasion awareness — across frontier models, with task-domain and watermarking ablations useful for alignment research.
●policyLow CI awareness scores provide empirical support for current AI control protocol designs, but the benchmark also maps the attack surface if awareness increases.