PlanBench-XL Tests LLM Agents on Long-Horizon Tool Planning | HACKOBAR_