● builder: If you use faithfulness metrics to evaluate RAG or grounded generation pipelines, your eval may be rewarding abstention. This benchmark offers a more complete evaluation framework.
● researcher: The completeness of F1 telemetry supplies what open-domain benchmarks lack: exact recall measurement alongside precision for grounded generation evaluation.
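
To make the abstention point concrete, here is a minimal sketch, not the benchmark's actual scoring code. It assumes answers and references can be reduced to sets of atomic claims, and that an empty answer scores 1.0 on a precision-style faithfulness metric because it contains no unsupported claims. The function names and toy facts are hypothetical.

```python
def claim_precision(claims: set, facts: set) -> float:
    """Share of generated claims that are supported by the reference facts."""
    if not claims:
        # Assumption: an empty answer has no unsupported claims, so it scores 1.0.
        return 1.0
    return len(claims & facts) / len(claims)


def claim_recall(claims: set, facts: set) -> float:
    """Share of reference facts that the answer covers.

    Only meaningful when the reference fact set is complete.
    """
    if not facts:
        return 1.0
    return len(claims & facts) / len(facts)


# Toy reference set, assumed complete for this illustration.
facts = {"a", "b", "c", "d"}

answers = {
    "abstain": set(),                        # says nothing
    "partial": {"a"},                        # one correct claim
    "thorough": {"a", "b", "c", "d", "e"},   # covers everything, one unsupported claim
}

for name, claims in answers.items():
    p = claim_precision(claims, facts)
    r = claim_recall(claims, facts)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Under these assumptions, a precision-only metric ranks the abstaining answer above the thorough one, while recall separates them. Recall can only be computed exactly when the reference set is known to be complete.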