VETO Benchmark Quantifies Misfired Alignment Where Models Reject Warranted Conclusions | HACKOBAR_