[arXiv] score: 0.37
Learning with Rare Success but Rich Feedback via Reflection-Enhanced Self-Distillation
May 14, 2026
Proposes the Reflection-Enhanced Self-Distillation (RESD) framework, which enables LLMs to learn from failure feedback in rare-success regimes by transforming environmental feedback into an active learning signal rather than using it only as passive conditioning.
cs.LG