τ-Rec Benchmark Replaces LLM-as-Judge for Agentic Recommender Evals | HACKOBAR_