[r/MachineLearning] score: 0.12

DVD-JEPA: Minimal reproducible JEPA world model in 32-dimensional latent space

June 20, 2026

DVD-JEPA trains a context encoder, an EMA (exponential moving average) target encoder, and a latent predictor on a 16x16 bouncing DVD logo environment. The model predicts future states in a 32-dimensional representation space, with no labels, no decoder, and no pixel prediction. It serves as a minimal reproducible proof of concept for LeCun's JEPA (Joint-Embedding Predictive Architecture), showing that world-model structure can be learned in a controlled setting.
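The pieces named in the summary can be illustrated with a small sketch. This is not the post's code: the logo size, the function names (`step`, `render`, `ema_update`, `latent_loss`), the momentum value `tau`, and the choice of mean squared error are all assumptions made for illustration. Only the 16x16 grid and the 32-dimensional latent come from the summary. The encoders and predictor themselves are left out; the sketch covers only the environment, the EMA rule commonly used for target encoders, and a latent-space loss.

```python
GRID = 16               # 16x16 frame, as stated in the summary
LATENT_DIM = 32         # latent size, as stated in the summary
LOGO_W, LOGO_H = 4, 2   # hypothetical logo size, not given in the summary


def step(x, y, dx, dy):
    """Move the logo one cell and reflect its velocity at the walls."""
    nx, ny = x + dx, y + dy
    if nx < 0 or nx + LOGO_W > GRID:
        dx = -dx
        nx = x + dx
    if ny < 0 or ny + LOGO_H > GRID:
        dy = -dy
        ny = y + dy
    return nx, ny, dx, dy


def render(x, y):
    """Return a GRID x GRID binary frame (list of rows) with the logo drawn."""
    return [
        [1.0 if x <= c < x + LOGO_W and y <= r < y + LOGO_H else 0.0
         for c in range(GRID)]
        for r in range(GRID)
    ]


def ema_update(target, online, tau=0.99):
    """EMA rule: target <- tau * target + (1 - tau) * online, elementwise.

    Assumed form of the target-encoder update; tau is a hypothetical value.
    """
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]


def latent_loss(pred, target):
    """Mean squared error between a predicted and a target latent vector.

    MSE is an assumption; the summary does not name the loss.
    """
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    # Roll the environment forward a few steps and print the logo position.
    x, y, dx, dy = 3, 5, 1, 1
    for _ in range(5):
        x, y, dx, dy = step(x, y, dx, dy)
        print(x, y)

    # Flat parameter vectors stand in for real encoder weights.
    target_params = [0.0] * LATENT_DIM
    online_params = [1.0] * LATENT_DIM
    target_params = ema_update(target_params, online_params)
    print(latent_loss(online_params, target_params))
```

In a full JEPA setup the loss would compare the predictor's output for a future frame against the target encoder's embedding of that frame, with gradients flowing only through the context encoder and predictor while the target encoder follows the EMA rule.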

HOW THIS AFFECTS YOU

● Researcher: Useful as a clean, reproducible baseline for studying JEPA dynamics without confounds from complex environments or large-scale compute.
