
GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz

June 16, 2026

GateGPT is a fully digital Transformer with a KV cache, designed gate by gate as a custom integrated circuit and prototyped on an FPGA. At an 80 MHz clock it generates more than 56,000 tokens per second with no CPU or GPU involved. The design implements Karpathy's microGPT architecture entirely in hardware. The current disclosure does not specify the model size or parameter count.
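To put the headline figures in perspective, the two numbers given (80 MHz and 56,000 tokens per second) imply a rough cycle budget per token. The sketch below is only back-of-the-envelope arithmetic: it assumes the throughput is sustained and that tokens are generated one at a time, neither of which the disclosure confirms. The function and constant names are illustrative.

```python
def cycles_per_token(clock_hz: float, tokens_per_second: float) -> float:
    """Clock cycles available per generated token at a given throughput."""
    return clock_hz / tokens_per_second


# Figures taken from the summary; "56,000+" is treated as exactly 56,000,
# so the cycle count is an upper bound on the per-token budget.
CLOCK_HZ = 80e6
TOKENS_PER_SECOND = 56_000

cycles = cycles_per_token(CLOCK_HZ, TOKENS_PER_SECOND)
micros = 1e6 / TOKENS_PER_SECOND  # time per token in microseconds

print(f"{cycles:.0f} cycles per token, {micros:.2f} us per token")
# prints "1429 cycles per token, 17.86 us per token"
```

Under those assumptions, each token takes at most about 1,429 clock cycles, or roughly 18 microseconds. Because the model size is undisclosed, this says nothing about work done per cycle.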

Original post: twitter.com
