[r/MachineLearning]score: 0.05
Built a normalizer so WER stops penalizing formatting differences in STT evals! [P]
April 23, 2026
**Gladia releases `gladia-normalization`, an open-source Python library for text normalization in STT evaluation pipelines.** The library applies configurable, YAML-defined normalization steps to both hypothesis and reference transcripts before WER computation, resolving surface-form mismatches like "$50" vs "fifty dollars" or "3:00PM" vs "3 pm" that inflate error rates without reflecting actual recognition failures. Currently supporting English, French, and German, it addresses a reproducibility gap in STT benchmarking where ad-hoc normalization scripts produce inconsistent, non-comparable WER scores across projects — relevant to any team running systematic evaluations of ASR systems like Whisper, Azure Speech, or similar engines.
project