[arXiv] score: 0.14
Black-Box Method Detects LLM Training Data Despite Style Laundering
May 29, 2026
Standard membership inference attacks fail when training data has been style-transformed to obscure its provenance. This work counters that by inferring the laundering transformation through black-box access to the LLM and synthesizing queries that mimic the laundered variants, which restores the detection signal even when rights owners hold only the originals.
cs.CR cs.AI
HOW THIS AFFECTS YOU
● researcher: The laundering-aware membership inference method introduces a new attack-defense framing worth incorporating into data provenance and copyright detection research.
● policy: This strengthens the technical case that style-based data laundering is not a reliable shield against copyright detection, which is relevant to ongoing LLM training data litigation and compliance frameworks.