CombEval Benchmark Exposes LLM Failures on Combinatorial Counting | HACKOBAR_