
GENEB: Why Genomic Models Are Hard to Compare

June 2, 2026

Frozen representations from 40 genomic foundation models, evaluated across 100 tasks in 13 functional categories, show that aggregate leaderboard rankings are unstable: model rankings shift sharply from one task category to another, and scale yields only modest, inconsistent gains. GENEB applies a unified probing protocol, including few-shot regimes, to enable controlled comparisons across architecture, tokenization, and pretraining data choices.
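
One way to make "ranking instability" concrete is to compare the aggregate ranking with each per-category ranking using Spearman correlation. The sketch below is only an illustration under stated assumptions: the model names, category names, and scores are invented placeholders (not GENEB results), and the helpers `rank_models` and `spearman` are hypothetical, not part of any GENEB tooling.

```python
from statistics import mean

# Synthetic, made-up probe scores purely for illustration (NOT GENEB data).
# scores[model][category] -> accuracy of a probe on frozen representations.
scores = {
    "model_a": {"promoter": 0.90, "splicing": 0.60, "enhancer": 0.72},
    "model_b": {"promoter": 0.80, "splicing": 0.75, "enhancer": 0.65},
    "model_c": {"promoter": 0.70, "splicing": 0.80, "enhancer": 0.60},
}


def rank_models(score_by_model):
    """Return {model: rank} with rank 1 = best; ties broken by model name."""
    ordered = sorted(score_by_model, key=lambda m: (-score_by_model[m], m))
    return {m: i + 1 for i, m in enumerate(ordered)}


def spearman(rank_x, rank_y):
    """Spearman correlation between two tie-free rankings of the same models."""
    n = len(rank_x)
    d2 = sum((rank_x[m] - rank_y[m]) ** 2 for m in rank_x)
    return 1 - 6 * d2 / (n * (n * n - 1))


categories = sorted(next(iter(scores.values())))

# Aggregate leaderboard: rank by mean score over all categories.
aggregate = rank_models({m: mean(s.values()) for m, s in scores.items()})

# Per-category leaderboards and their agreement with the aggregate one.
per_category = {
    c: rank_models({m: s[c] for m, s in scores.items()}) for c in categories
}
agreement = {c: spearman(aggregate, per_category[c]) for c in categories}

for c in categories:
    print(f"{c}: Spearman vs aggregate = {agreement[c]:+.2f}")
```

In this toy data the aggregate ranking matches two categories exactly but is fully reversed on the third, which is the kind of disagreement a single leaderboard number hides.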

