[arXiv]score: 0.22

Gradient-based Epistemic Goggles to Mitigate Negation Neglect

July 3, 2026

Goggles is a learned module that intervenes on LoRA gradients during supervised finetuning to induce specific epistemic frames. It addresses Negation Neglect, where models fail to identify fictional claims as such, achieving better stance identification than traditional data annotation methods.

HOW THIS AFFECTS YOU

●

researcherYou can use this method to control how models perceive the truthfulness of training data without manual labeling.

read original ↗arxiv.org

DAILY DIGEST

catch up on AI in 2 minutes, every morning. free. unsubscribe anytime. privacy

← back to feed