[HUGGINGFACE] score: 0.48
3DCodeBench Tests 12 VLMs on Procedural 3D Code Generation
May 30, 2026
3DCodeBench evaluates 12 VLMs on their ability to translate text and image references into procedural 3D modeling code, targeting deterministic, engine-ready asset generation that neural 3D generators cannot reliably produce. The benchmark pairs automated metrics with 3DCodeArena, a human evaluation arena, to better capture perceptual 3D shape quality.
HOW THIS AFFECTS YOU
● Builder: If you're building 3D content pipelines, this benchmark shows which VLMs are currently most capable of generating usable procedural geometry code.
● Researcher: The benchmark provides a structured evaluation framework for comparing VLM agents on geometric code reasoning, with human evaluation to complement the automated metrics.