[arXiv] score: 0.15
Eight LLMs Tested as Social Engineering Attackers Across English and Chinese
May 29, 2026
An LLM-to-LLM red-teaming framework evaluates eight models on multi-turn social engineering scenarios in English and Chinese, finding that adversarial dialogues follow recurrent escalation patterns that single-turn safety evals miss entirely. Statistically significant cross-model and cross-lingual differences in attack success rates suggest current safety benchmarks underestimate real-world conversational manipulation risk.
cs.CL
HOW THIS AFFECTS YOU
● researcher: The annotated attacker/defender strategy taxonomy and transition analysis provide a structured methodology for evaluating multi-turn adversarial robustness beyond standard benchmarks.
● policy: Cross-lingual gaps in defensive performance across eight production-class models indicate that safety evaluations limited to English systematically underestimate manipulation vulnerability in other languages.