[arXiv] score: 0.41
Enabling Performant and Flexible Model-Internal Observability for LLM Inference
May 13, 2026
DMI-Lib decouples model-internal observability from the inference hot path via an asynchronous GPU-CPU memory abstraction (Ring²) and a policy-controlled host backend for capturing internal LLM states.
cs.LG cs.AI cs.PF cs.SE cs.SY eess.SY