[arXiv] score: 0.11

wav2vec2.0 Needs Supervised Fine-Tuning to Learn Phonological Compensation

June 17, 2026

Testing wav2vec2.0 on Mandarin tone compensation shows no evidence of phonological context sensitivity from self-supervised pretraining alone; probing classifiers reveal partial compensation only after ASR fine-tuning. This contradicts prior claims that phonological structure emerges from pretraining and suggests supervised objectives are necessary for at least some phonological abstractions.

HOW THIS AFFECTS YOU

● Researcher: Directly challenges assumptions about what self-supervised speech models learn, with implications for how you design training pipelines for tonal or phonologically complex languages.

