
AGOP as Explanation: From Feature Learning to Per-Sample Attribution in Image Classifiers

May 14, 2026

AGOP-Weighted repurposes the Average Gradient Outer Product (AGOP), a feature-learning quantity from the Neural Feature Ansatz, as a per-sample attribution method. It weights each sample's gradients by a training-distribution prior, sqrt(diag(M) / max(diag(M))), where M denotes the AGOP matrix. This bridges feature-learning theory and explainability for image classifiers, and gives practitioners a theoretically grounded saliency method that needs no additional forward passes.
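The weighting can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it assumes M is the mean of the outer products of the training-set input gradients, so diag(M) is the per-feature mean squared gradient, and it assumes the prior is applied to a sample's gradient as an elementwise product. The toy tanh model and the names `agop_weights`, `agop_weighted_attribution`, and `input_gradient` are hypothetical and chosen only for illustration.

```python
import numpy as np

def agop_diag(grads):
    """Diagonal of M = mean_i g_i g_i^T for input gradients of shape (n, d)."""
    return np.mean(grads ** 2, axis=0)

def agop_weights(grads, eps=1e-12):
    """Per-feature prior sqrt(diag(M) / max(diag(M))), with values in [0, 1]."""
    diag = agop_diag(grads)
    return np.sqrt(diag / max(diag.max(), eps))

def agop_weighted_attribution(sample_grad, weights):
    """ASSUMPTION: the prior rescales the sample gradient elementwise."""
    return weights * sample_grad

# Hypothetical toy model f(x) = tanh(w . x) with an analytic input gradient.
w_true = np.array([2.0, 0.5, 0.0])

def input_gradient(X):
    """Gradient of tanh(X @ w_true) with respect to each row of X."""
    pre = X @ w_true
    return (1.0 - np.tanh(pre) ** 2)[:, None] * w_true[None, :]

rng = np.random.default_rng(0)
X_train = rng.normal(size=(256, 3))

# The prior is computed once from training-set gradients.
weights = agop_weights(input_gradient(X_train))

# At explanation time only the sample's own gradient is needed.
x = np.array([[0.1, -0.3, 0.7]])
attribution = agop_weighted_attribution(input_gradient(x)[0], weights)
print(weights, attribution)
```

Under these assumptions the prior is computed once and cached, so explaining a new sample costs a single gradient computation plus an elementwise multiply. In this toy model the third input never influences the output, so its weight and attribution are both zero.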

cs.LG

SOURCE

https://arxiv.org/abs/2605.12816
