[HUGGINGFACE] score: 0.55
LLMEval-Logic: A Solver-Verified Chinese Benchmark for Logical Reasoning of LLMs with Adversarial Hardening
May 18, 2026
LLMEval-Logic is a Chinese logical reasoning benchmark that uses solver-verified formal annotations and adversarial hardening to resist saturation by frontier models. It addresses known weaknesses of template-generated benchmarks with coarse annotations. Relevant for teams evaluating reasoning models on non-English tasks.
paper