[arXiv] score: 0.49
PowerFlow Uses Alpha-Power Distributions to Elicit Reasoning or Diversity from LLMs
May 26, 2026
PowerFlow reformulates unsupervised LLM fine-tuning as distribution matching, trained with a GFlowNet and a length-aware Trajectory-Balance objective. Targeting an α-power distribution gives directional control over the trade-off between sharper reasoning and more diverse output.
cs.CL cs.AI cs.LG
HOW THIS AFFECTS YOU
● researcher: The α-power distribution target provides a principled knob for trading off reasoning intensification against creative diversity without external supervision, while the length-aware objective addresses length bias in autoregressive generation.
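To make the α-power knob concrete, here is a minimal Python sketch on a toy categorical distribution. It assumes the usual definition of a power distribution, p_α(x) ∝ p(x)^α, which this summary does not spell out. The function names and the three-outcome `base` distribution are hypothetical, and the sketch does not reproduce PowerFlow's GFlowNet training or its length-aware Trajectory-Balance objective.

```python
import math


def alpha_power(probs, alpha):
    """Return the distribution proportional to probs[i] ** alpha.

    Assumed definition of an alpha-power target; illustrative only.
    """
    weights = [p ** alpha for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy in nats; lower means a more peaked distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical base distribution over three candidate outputs.
base = [0.5, 0.3, 0.2]

sharp = alpha_power(base, 2.0)  # alpha > 1 concentrates mass on the mode
flat = alpha_power(base, 0.5)   # alpha < 1 spreads mass more evenly

for name, dist in [("base", base), ("alpha=2.0", sharp), ("alpha=0.5", flat)]:
    rounded = [round(p, 3) for p in dist]
    print(f"{name:10s} {rounded} entropy={entropy(dist):.3f}")
```

Under this assumed definition, α > 1 lowers entropy (a sharper distribution) and α < 1 raises it (a more diverse one), while α = 1 leaves the base distribution unchanged.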