[HUGGINGFACE] score: 0.80
Zero-CoT Probe Detects Evasive Benchmark Contamination in LLMs
May 20, 2026
The Zero-CoT Probe (ZCP) suppresses chain-of-thought generation to expose memorization that reasoning steps would otherwise hide, enabling black-box detection of paraphrase-based benchmark contamination that evades existing methods.
HOW THIS AFFECTS YOU
● Researcher: Your benchmark evaluations may be compromised by evasive contamination; ZCP gives you a new black-box tool to audit models without needing training data access.
● Policy: Worth watching because leaderboard manipulation via paraphrased benchmark data is harder to detect than direct contamination, and this method provides a concrete detection mechanism.