
Gzip as a Language Model: Compression Efficiency Scores Text Predictions

June 18, 2026

Text continuations can be scored by measuring how efficiently gzip compresses each candidate output when it is appended to a prompt: the fewer extra bytes a candidate adds, the more probable it is taken to be, so compressed size acts as a probability proxy. The approach has no learned parameters and serves as a baseline for understanding which statistical patterns LLMs capture.
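
As a rough illustration of the idea, the sketch below scores candidates by how many extra compressed bytes each one adds to the prompt. It is not taken from the original post: the function names (compressed_size, continuation_cost, rank_candidates) are hypothetical, and using the raw byte difference without length normalization is an assumption made to keep the example short.

```python
import gzip


def compressed_size(text: str) -> int:
    """Number of bytes gzip needs for the UTF-8 encoding of `text`."""
    return len(gzip.compress(text.encode("utf-8")))


def continuation_cost(prompt: str, candidate: str) -> int:
    """Extra compressed bytes incurred by appending `candidate` to `prompt`.

    Assumption: a lower cost stands in for a higher probability.
    No length normalization is applied, so compare candidates of
    similar length.
    """
    return compressed_size(prompt + candidate) - compressed_size(prompt)


def rank_candidates(prompt: str, candidates: list[str]) -> list[str]:
    """Candidates sorted from cheapest (most 'probable') to most expensive."""
    return sorted(candidates, key=lambda c: continuation_cost(prompt, c))


if __name__ == "__main__":
    prompt = "the cat sat on the mat. " * 20
    candidates = ["zqxj vkwp bnfg hrtd lmc. ", "the cat sat on the mat. "]
    for c in rank_candidates(prompt, candidates):
        print(continuation_cost(prompt, c), repr(c))
```

A candidate that repeats material already in the prompt should rank ahead of one made of unseen character sequences, because gzip can encode the repeat as a back-reference instead of new literals.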

HOW THIS AFFECTS YOU

● Researcher: The compression-as-LM framing offers a parameter-free baseline useful for isolating what neural LLMs learn beyond raw statistical redundancy.

Read original: nathan.rs
