[arXiv] score: 0.44
A Unified Framework for Critical Scaling of Inverse Temperature in Self-Attention
May 14, 2026
Provides a unified theory of length-dependent logit rescaling in self-attention. It resolves conflicting inverse-temperature laws by deriving the critical scale from the gap-counting function N_n of the attention rows, which determines how the inverse temperature should scale with the context length n.
stat.ML cs.LG math.PR
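
To make the summary's notion of length-dependent logit rescaling concrete, here is a minimal toy sketch in Python. It is not the paper's construction: the helper names `attention_row` and `top_weight`, the single-gap logit pattern, and the choice beta = log n are illustrative assumptions, and the gap-counting function N_n is not defined in this summary, so it is not modeled here. The sketch only shows why a fixed inverse temperature can wash out an attention row as n grows, while a length-dependent one need not.

```python
import math

def attention_row(logits, beta):
    """Softmax of beta * logits (one attention row), computed stably."""
    m = max(logits)
    exps = [math.exp(beta * (x - m)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_weight(n, beta, gap=1.0):
    """Weight on the best key when the other n - 1 keys trail it by `gap`.

    Toy assumption: one leading logit and n - 1 identical trailing logits.
    """
    logits = [0.0] + [-gap] * (n - 1)
    return attention_row(logits, beta)[0]

for n in (10, 100, 1000, 10000):
    fixed = top_weight(n, beta=1.0)           # inverse temperature independent of n
    scaled = top_weight(n, beta=math.log(n))  # illustrative length-dependent choice
    print(f"n={n:>6}  fixed beta: {fixed:.4f}  beta=log n: {scaled:.4f}")
```

In this toy configuration the top weight equals 1 / (1 + (n - 1) e^{-beta * gap}), so with fixed beta it decays to 0 as n grows, while with beta = log n and gap = 1 it stays near 1/2. Other gap structures would call for a different scale, which is the kind of dependence the summary attributes to N_n.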