[arXiv] score: 0.24
Sparse Autoencoder Decomposition of Clinical Sequence Model Representations: Feature Complexity, Task Specialisation, and Mortality Prediction
May 7, 2026
Researchers applied TopK sparse autoencoders (SAEs) to FlatASCEND, a 14.5M-parameter EHR foundation model, extracting features from all 10 residual stream layers on the MIMIC-IV and INSPECT datasets. The decomposition shows progressive abstraction with depth: 45.7% of layer-0 features are singleton token detectors, while layer-6 features span roughly 30 token types. SAE features outperform dense representations for discrete mortality prediction but underperform them for continuous length-of-stay regression, suggesting that representational geometry is task-specific. For clinical AI teams building interpretable EHR models, this mechanistic approach moves beyond black-box probing toward a circuit-level account of how transformer depth is used in healthcare sequences.
cs.LG, cs.AI, cs.CL
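
For readers unfamiliar with the TopK sparse autoencoder named in the summary, the forward pass can be sketched roughly as follows. This is a generic illustration and not the paper's implementation: the function name `topk_sae_forward`, the toy dimensions, the random weights, the decoder-bias subtraction before encoding, and the ReLU applied to the kept activations are all assumptions made for the example.

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """Hypothetical TopK SAE forward pass.

    x: (batch, d_model) residual-stream activations.
    Returns (codes, recon): sparse feature activations and the reconstruction.
    """
    # Encode: centre by the decoder bias (an assumed convention), then project.
    pre = (x - b_dec) @ W_enc + b_enc              # (batch, n_features)
    # Column indices of the k largest pre-activations in each row.
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    rows = np.arange(pre.shape[0])[:, None]
    # Keep only those k entries (clipped at zero); every other feature is 0.
    codes = np.zeros_like(pre)
    codes[rows, idx] = np.maximum(pre[rows, idx], 0.0)
    # Decode: linear map back to the model dimension.
    recon = codes @ W_dec + b_dec
    return codes, recon

# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
d_model, n_features, k = 16, 64, 4
x = rng.normal(size=(8, d_model))
W_enc = rng.normal(size=(d_model, n_features)) * 0.1
W_dec = W_enc.T.copy()                             # tied initialisation (assumption)
b_enc = np.zeros(n_features)
b_dec = np.zeros(d_model)

codes, recon = topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k)
active_per_row = (codes != 0).sum(axis=1)          # at most k per example
```

Each row of `codes` has at most `k` nonzero entries, which is what makes individual features inspectable, for example by checking which token types a given feature fires on.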