[r/LocalLLaMA] score: 0.18

Follow-up: Trying to make NVIDIA GPUs plug-and-play on Macs. Found hidden RDMA symbols Apple doesn't want you to see — zero-copy GPU memory sharing might already work.

May 6, 2026
A hobbyist reverse-engineering effort on a 4-node Mac cluster (3x M3 Ultra plus an M5 Max, 1.5 TB unified memory) found undocumented ibv_reg_dmabuf_mr symbols in Apple's libibverbs. Their presence suggests that GPUDirect-style zero-copy RDMA using Metal GPU buffers may be achievable on macOS without kernel patches. Separately, an NVIDIA Blackwell GPU is detected over TB5, but GSP firmware boot fails; this is a known gap being debugged with tinygrad. If confirmed, zero-copy registration would bypass the CPU-staged memory copies traditionally used in distributed inference, which is directly relevant to anyone running multi-node LLM workloads on Apple Silicon. Apple has published no documentation for this capability, so the finding still needs community validation.
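For readers who want to repeat the first step of that validation, the following is a minimal sketch of how one might check whether a shared library exports a given symbol, using Python's standard `ctypes` module. The helper name `has_symbol` and the default library name `libibverbs.dylib` are assumptions made for illustration; the post does not say where the library lives or how the symbols were found. An exported symbol only shows that the entry point exists, not that the call works.

```python
import ctypes
import sys


def has_symbol(library, symbol):
    """Return True if `library` can be loaded and exports `symbol`.

    `library` is a path or name accepted by ctypes.CDLL; passing None
    searches the symbols already loaded into the current process (POSIX).
    `has_symbol` is a hypothetical helper, not part of any RDMA library.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # The library is missing or could not be loaded on this system.
        return False
    try:
        # Attribute lookup on a CDLL performs a dlsym-style lookup and
        # raises AttributeError when the symbol is not exported.
        getattr(handle, symbol)
    except AttributeError:
        return False
    return True


if __name__ == "__main__":
    # Assumed default name; pass the real path as the first argument.
    lib = sys.argv[1] if len(sys.argv) > 1 else "libibverbs.dylib"
    for name in ("ibv_reg_mr", "ibv_reg_dmabuf_mr"):
        print(name, "exported" if has_symbol(lib, name) else "not found")
```

In the upstream Linux rdma-core library, `ibv_reg_dmabuf_mr` registers a memory region backed by a dma-buf file descriptor. Whether Apple's build accepts Metal buffers in the same way is exactly the open question the post raises.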