[arXiv] score: 0.24
Democratizing the medieval English legal tradition
May 5, 2026
Researchers released an open-source handwritten text recognition (HTR) pipeline for medieval Latin legal manuscripts. It trains R-BLLa for line segmentation and a CNN+LSTM with CTC decoding for handwriting recognition, using 4,029 lines from 193 Anglo-American legal cases. Baseline word accuracy reaches 79%, and post-processing pushes it higher despite the small training corpus. Historians, legal scholars, and digital humanities teams should take note: the pipeline could open up millions of previously inaccessible pages foundational to the common law, easing a manual transcription bottleneck in which at most a few dozen specialists worldwide hold the relevant paleographic expertise.
cs.CV, cs.AI, cs.CL
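For readers unfamiliar with the CTC decoding step mentioned in the summary, here is a minimal sketch of greedy (best-path) CTC decoding in plain Python. It is not the paper's implementation: the function name `ctc_greedy_decode`, the toy three-letter alphabet, the choice of index 0 as the blank symbol, and the per-frame scores are all assumptions made for illustration. A real recognizer would produce the scores from the CNN+LSTM and might use beam search or a language model instead.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (an illustrative convention)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # 1. Pick the highest-scoring class index at each time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge runs of identical consecutive indices, then
    # 3. discard blanks, which only separate repeated characters.
    collapsed = [idx for idx, _ in groupby(best_path) if idx != blank]
    return "".join(alphabet[idx] for idx in collapsed)


# Toy alphabet: index 0 is the blank, indices 1-3 are characters.
alphabet = ["", "r", "e", "x"]

# Hypothetical per-frame scores for six time steps (one row per frame).
frames = [
    [0.10, 0.70, 0.10, 0.10],  # r
    [0.20, 0.60, 0.10, 0.10],  # r (repeat, merged with the previous frame)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # e
    [0.90, 0.04, 0.03, 0.03],  # blank
    [0.10, 0.10, 0.10, 0.70],  # x
]

print(ctc_greedy_decode(frames, alphabet))  # prints "rex"
```

The blank symbol is what lets the decoder tell a genuinely doubled letter from one letter spread over several frames: the path `r, blank, r` decodes to "rr", while `r, r` decodes to "r".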