[arXiv]score: 0.13
Scaling Laws for Task-Specific LLM Distillation
June 24, 2026
Empirical scaling laws for compressing LLMs via iterative structural pruning show that general-knowledge benchmark scores collapse at lower compression ratios than in-domain task quality does, and that supervision format is the primary driver of that gap. In quantitative-finance experiments, a blended chain-of-thought KL-divergence loss over reasoning traces recovers general knowledge that standard logit-based or LoRA distillation loses. The laws quantify degradation curves across dataset size, compression ratio, and pruning schedule.
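The summary does not state the loss formula, so the following is only a minimal sketch of one plausible form of a blended loss over a reasoning trace: a weighted sum of token-level cross-entropy on the trace tokens and a temperature-scaled KL divergence from teacher to student. The function names, the mixing weight `alpha`, and the `temperature` parameter are all assumptions made for illustration and are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def blended_trace_loss(student_logits, teacher_logits, target_ids,
                       alpha=0.5, temperature=1.0):
    """Hypothetical blended loss, averaged over the positions of one trace.

    Per position: alpha * CE(student, target)
                  + (1 - alpha) * T^2 * KL(teacher || student).
    This form is an assumption, not the paper's stated objective.
    """
    if not (len(student_logits) == len(teacher_logits) == len(target_ids)):
        raise ValueError("all inputs must cover the same trace positions")
    if not target_ids:
        raise ValueError("trace must contain at least one token")

    total = 0.0
    for s, t, y in zip(student_logits, teacher_logits, target_ids):
        # Hard-label term: negative log-likelihood of the trace token.
        ce = -log_softmax(s)[y]

        # Soft-label term: KL(teacher || student) at the given temperature.
        s_logp = log_softmax([x / temperature for x in s])
        t_logp = log_softmax([x / temperature for x in t])
        kl = sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t_logp, s_logp))

        total += alpha * ce + (1.0 - alpha) * temperature ** 2 * kl
    return total / len(target_ids)


# Toy two-token trace over a three-word vocabulary (made-up numbers).
teacher = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student = [[1.0, 0.2, -0.5], [0.0, 0.9, 0.6]]
loss = blended_trace_loss(student, teacher, target_ids=[0, 1], alpha=0.3)
```

With `alpha=1.0` the sketch reduces to plain cross-entropy on the trace tokens, and with `alpha=0.0` it reduces to pure logit matching, so the weight interpolates between the two supervision formats the summary contrasts.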