[HUGGINGFACE] score: 0.55
Bootstrap Mode Frequency Best Calibrates Activation Oracle Confidence (ECE 5.7% vs 25.5%)
May 24, 2026
Across 6,000 samples per oracle on Qwen3-8B and Qwen3.6-27B, bootstrap mode frequency achieves an expected calibration error (ECE) of 5.7%, versus 25.5% for log-probability baselines, making it the best-calibrated uncertainty quantification method for activation oracle outputs.
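For readers unfamiliar with the metric, ECE can be sketched as follows. This is a minimal illustration of the standard equal-width-bin definition (the weighted average gap between accuracy and mean confidence per bin). The function name, the choice of 10 bins, and the input format are assumptions for illustration; the summary does not say which binning the paper uses.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE (illustrative; binning scheme is an assumption).

    confidences: floats in [0, 1], one per prediction.
    correct:     booleans, True where the prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Assign each prediction to a bin by its confidence; 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    # Sum |accuracy - mean confidence| over bins, weighted by bin size.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Four predictions at 0.8 confidence, only one correct: gap is |0.25 - 0.8|.
    print(expected_calibration_error([0.8] * 4, [True, False, False, False]))
```

An ECE of 5.7% therefore means that, averaged over bins, stated confidence and observed accuracy differ by about 5.7 percentage points.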
HOW THIS AFFECTS YOU
● Researcher: Bootstrap mode frequency should replace log-probability as the default confidence estimator for activation oracles, with log-prob retained only as a fast triage signal.
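One plausible reading of "bootstrap mode frequency" is sketched below. The summary does not define the estimator, so everything here is an assumption: the oracle is sampled several times for the same input, the most common answer is taken as the prediction, and confidence is the fraction of bootstrap resamples whose mode matches that prediction. The function name, the `n_boot` and `seed` parameters, and the example answers are hypothetical and do not come from the paper.

```python
import random
from collections import Counter


def bootstrap_mode_frequency(answers, n_boot=1000, seed=0):
    """Return (mode answer, fraction of bootstrap resamples sharing that mode).

    ASSUMPTION: this is an illustrative interpretation, not the paper's method.
    answers: non-empty list of hashable oracle outputs for one input.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    rng = random.Random(seed)

    # Prediction: the most frequent answer among the original samples.
    mode = Counter(answers).most_common(1)[0][0]

    hits = 0
    for _ in range(n_boot):
        # Resample with replacement, same size as the original sample set.
        resample = rng.choices(answers, k=len(answers))
        if Counter(resample).most_common(1)[0][0] == mode:
            hits += 1
    return mode, hits / n_boot


if __name__ == "__main__":
    # Hypothetical oracle outputs for a single activation.
    samples = ["cat", "cat", "dog", "cat", "bird", "cat"]
    prediction, confidence = bootstrap_mode_frequency(samples)
    print(prediction, confidence)
```

Under this reading the estimator needs multiple oracle samples per input, which is consistent with keeping log-probability, available from a single forward pass, as the cheaper triage signal.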