[HUGGINGFACE]score: 0.42
ToolSense: A Diagnostic Framework for Auditing Parametric Tool Knowledge in LLMs
June 3, 2026
ToolSense audits whether LLMs genuinely understand parametric tool knowledge versus pattern-matching on constrained decoding paths. The framework targets a gap in ToolBench-style evaluations, which use verbose queries and restrict outputs to valid token paths, masking shallow memorization. It runs LLM-powered diagnostics against models trained with two-stage memorization-then-retrieval SFT fine-tuning.