Sequential LoRA fine-tuning of LLaMA-3.1-8B beats 70B baseline on essay scoring
June 10, 2026
Sequential curriculum fine-tuning of LLaMA-3.1-8B with 4-bit LoRA, trained on discourse elements in order (lead → position → claim → evidence → conclusion), reaches F1 scores of 65% on evidence and 87% on conclusion on PERSUADE 2.0. On conclusion, it outperforms both independently trained task models and a general-purpose LLaMA-70B baseline.
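The stage ordering described above can be illustrated with a minimal sketch. It shows only the control flow of a sequential curriculum, where each stage starts from the adapter state left by the previous one. The names `AdapterState`, `run_curriculum`, and `fake_train_stage` are hypothetical and are not taken from the paper; a real setup would replace the stub with an actual 4-bit LoRA training step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Stage order as given in the summary: lead -> position -> claim -> evidence -> conclusion.
CURRICULUM = ["lead", "position", "claim", "evidence", "conclusion"]


@dataclass
class AdapterState:
    """Hypothetical stand-in for LoRA adapter weights; records which stages were trained."""
    history: List[str] = field(default_factory=list)


# A training step takes the current adapter, the element name, and that element's examples.
TrainStage = Callable[[AdapterState, str, List[str]], AdapterState]


def run_curriculum(
    train_stage: TrainStage,
    examples_by_element: Dict[str, List[str]],
    stages: List[str] = CURRICULUM,
) -> AdapterState:
    """Train one stage at a time, carrying the adapter forward between stages."""
    state = AdapterState()
    for element in stages:
        examples = examples_by_element.get(element, [])
        # Each stage resumes from the previous stage's adapter instead of a fresh one.
        # An "independent task model" baseline would instead start from AdapterState() here.
        state = train_stage(state, element, examples)
    return state


def fake_train_stage(state: AdapterState, element: str, examples: List[str]) -> AdapterState:
    """Assumed placeholder: a real step would run LoRA fine-tuning on `examples`."""
    return AdapterState(history=state.history + [element])


final_state = run_curriculum(fake_train_stage, {"lead": ["example essay opening"]})
print(final_state.history)
```

The only difference between the sequential and independent setups in this sketch is whether `state` is carried across loop iterations or reset at each stage.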
HOW THIS AFFECTS YOU
● Researcher: The finding that task ordering in sequential fine-tuning lets an 8B model outperform a roughly 9x larger model on conclusion scoring is a useful data point for curriculum design in structured prediction tasks.