BenSyc Benchmark Tests Sycophancy in Bengali LLM Conversations Across 15+ Models
June 10, 2026
BenSyc is a human-validated benchmark built from 11,840 Reddit posts and 170k comments covering Bengali social contexts, with a five-level taxonomy ranging from Invalidation to Escalation. An evaluation of 15+ open and proprietary LLMs reveals gaps in distinguishing empathetic support from reinforcing sycophancy in non-English, culturally grounded dialogue.
HOW THIS AFFECTS YOU
● Researcher: Provides a concrete evaluation framework and dataset for sycophancy beyond English factual settings, useful for benchmarking multilingual alignment.