
Skill-3D Builds Scene-Aware Tool-Use Memory for MLLM 3D Spatial Reasoning Agents

June 5, 2026

Skill-3D targets uniform tool-use bias in MLLM agents that perform 3D spatial reasoning, where an agent applies the same tool strategy regardless of the scene. It records successful tool trajectories into a Scene Memory, then distills them into reusable, scene-specific skills. The skill library evolves by scene type, and the framework is reported to improve over both non-agentic baselines and fixed-strategy agents.
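The record-then-distill loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `SceneMemory` class, the `select_tools` helper, the scene labels, and the tool names are hypothetical, and the distillation rule (take the most frequent successful tool sequence per scene type) is a simple stand-in, not the method Skill-3D actually uses.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class SceneMemory:
    """Hypothetical store of successful tool trajectories, keyed by scene type."""

    _trajectories: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, scene_type, tools, success):
        # Keep only successful trajectories, as the summary describes.
        if success:
            self._trajectories[scene_type].append(tuple(tools))

    def distill(self):
        # Stand-in distillation (an assumption): the most frequent successful
        # tool sequence for each scene type becomes that scene's "skill".
        return {
            scene: Counter(trajs).most_common(1)[0][0]
            for scene, trajs in self._trajectories.items()
        }


def select_tools(skills, scene_type, default):
    # Unseen scene types fall back to a fixed default strategy.
    return skills.get(scene_type, default)


# Usage with made-up scene labels and tool names.
memory = SceneMemory()
memory.record("indoor", ["depth_estimator", "object_detector"], success=True)
memory.record("indoor", ["depth_estimator", "object_detector"], success=True)
memory.record("indoor", ["object_detector"], success=True)
memory.record("outdoor", ["segmenter"], success=False)  # failure: not stored

skills = memory.distill()
indoor_plan = select_tools(skills, "indoor", default=("object_detector",))
outdoor_plan = select_tools(skills, "outdoor", default=("object_detector",))
```

Keying the memory by scene type is what lets tool selection differ between scenes. A fixed-strategy agent corresponds to always returning `default`.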

HOW THIS AFFECTS YOU

● Researcher: Skill-3D provides a concrete mechanism for adaptive tool selection in 3D reasoning agents, with empirical evidence that scene heterogeneity is a key failure mode for uniform agentic strategies.
