Skill-3D Builds Scene-Aware Tool-Use Memory for MLLM 3D Spatial Reasoning Agents
June 5, 2026
Skill-3D targets uniform tool-use bias, the tendency of MLLM agents to apply the same tool strategy to every scene, in 3D spatial reasoning. It records successful tool trajectories in a Scene Memory and distills them into reusable, scene-specific skills. The skill library evolves by scene type, and the framework improves over both non-agentic baselines and fixed-strategy agents.
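The record-then-distill loop described above can be illustrated with a small Python sketch. This is not Skill-3D's implementation: the class `SceneMemory`, the helper `select_tools`, the scene types, the tool names, and the "most frequent successful sequence" distillation rule are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter, defaultdict


class SceneMemory:
    """Hypothetical store of successful tool trajectories, keyed by scene type."""

    def __init__(self):
        self._trajectories = defaultdict(list)

    def record(self, scene_type, tools, success):
        # Only successful trajectories are kept, as the summary describes.
        if success:
            self._trajectories[scene_type].append(tuple(tools))

    def distill(self):
        # Assumed distillation rule (not from the source): the most frequent
        # successful tool sequence per scene type becomes that type's skill.
        skills = {}
        for scene_type, runs in self._trajectories.items():
            skills[scene_type] = list(Counter(runs).most_common(1)[0][0])
        return skills


def select_tools(skills, scene_type, default_tools):
    # Fall back to a fixed strategy when no skill exists for the scene type.
    return skills.get(scene_type, list(default_tools))


# Hypothetical scene types and tool names, for illustration only.
memory = SceneMemory()
memory.record("indoor", ["depth_estimator", "object_detector"], success=True)
memory.record("indoor", ["depth_estimator", "object_detector"], success=True)
memory.record("indoor", ["object_detector"], success=True)
memory.record("outdoor", ["object_detector"], success=False)
skills = memory.distill()
```

In this sketch, a scene type with no recorded successes falls back to the fixed default, which mirrors the contrast the summary draws between scene-aware skills and fixed-strategy agents.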
HOW THIS AFFECTS YOU
● Researcher: Skill-3D provides a concrete mechanism for adaptive tool selection in 3D reasoning agents, with empirical evidence that scene-heterogeneity is a key failure mode for uniform agentic strategies.