[arXiv] score: 0.41
WriteSAE: Sparse Autoencoders for Recurrent State
May 14, 2026
WriteSAE introduces the first sparse autoencoder that targets matrix cache writes in state-space and hybrid recurrent LLMs such as Mamba-2, Gated DeltaNet, and RWKV-7, a part of the model that residual-stream SAEs cannot reach. It factors decoder atoms into the architectures' native rank-1 write shapes, comes with a closed-form per-token logit shift, and is trained with a Frobenius-norm objective. The work is relevant to mechanistic interpretability researchers working beyond transformer architectures.
cs.LG, cs.AI, cs.CL
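
To make the summary above concrete, here is a minimal NumPy sketch of what a sparse autoencoder with rank-1 decoder atoms and a Frobenius-norm reconstruction loss could look like. It is not the paper's implementation. Every name in it (`encode`, `decode`, `frobenius_loss`, `logit_shift`, and all shapes) is hypothetical, the ReLU encoder is an assumed choice, and the logit-shift function assumes a purely linear readout from the state to the logits, which ignores any gating or normalization a real model applies.

```python
import numpy as np


def encode(W, U, V, bias):
    """Assumed encoder: ReLU over Frobenius inner products with each atom.

    W is one matrix write of shape (d_k, d_v); U is (n_atoms, d_k) and
    V is (n_atoms, d_v). Uses the identity <W, u v^T>_F = u^T W v.
    """
    pre = np.einsum("ik,kv,iv->i", U, W, V) + bias
    return np.maximum(pre, 0.0)


def decode(acts, U, V):
    """Reconstruct the write as a weighted sum of rank-1 atoms u_i v_i^T."""
    return np.einsum("i,ik,iv->kv", acts, U, V)


def frobenius_loss(W, W_hat):
    """Squared Frobenius norm of the reconstruction error."""
    return float(np.sum((W - W_hat) ** 2))


def logit_shift(act, u, v, q, W_unembed):
    """Logit change from adding act * u v^T to the state S.

    Assumption: the state is read as o = S^T q and logits are W_unembed @ o,
    with nothing nonlinear in between. Then the read changes by
    act * (q . u) * v, and the logits change by W_unembed times that.
    """
    return W_unembed @ (act * np.dot(q, u) * v)
```

Under the linear-readout assumption, a rank-1 atom changes the read vector only along `v`, scaled by how strongly the query `q` aligns with `u`, which is why the shift can be written in closed form without rerunning the model.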