[HUGGINGFACE] score: 0.48

PARCEL Combines Pooling and Query Resampling to Fix LVLM Visual Token Compression

May 27, 2026

PARCEL addresses spectral aliasing from spatial-only pooling and spatial grounding loss from query-only resampling in large vision-language models by combining pool-anchored spatial tokens with conditioned elastic queries. The method enables a single model to run at multiple visual-token budgets without the quality degradation of existing elastic compression approaches.
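The summary gives no implementation details, so the following is only a rough sketch of the general idea of pairing pooled grid tokens with query-resampled tokens. It is not PARCEL's actual architecture. The names `avg_pool`, `cross_attend`, `compress`, and `query_bank` are hypothetical, the conditioning step (shifting each query by the mean pooled token) is an illustrative assumption, and the attention uses no learned projections.

```python
import math
import random


def avg_pool(grid, out):
    """Average-pool an H x W grid of D-dim patch features into out*out tokens."""
    h, w = len(grid), len(grid[0])
    sh, sw = h // out, w // out  # assumes H and W are divisible by `out`
    tokens = []
    for i in range(out):
        for j in range(out):
            cell = [grid[y][x]
                    for y in range(i * sh, (i + 1) * sh)
                    for x in range(j * sw, (j + 1) * sw)]
            tokens.append([sum(col) / len(cell) for col in zip(*cell)])
    return tokens


def cross_attend(queries, keys):
    """Single-head dot-product attention; keys double as values (no projections)."""
    dim = len(keys[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(wt * k[c] for wt, k in zip(weights, keys)) / total
                    for c in range(dim)])
    return out


def compress(grid, pool_size, num_queries, query_bank):
    """Return pool_size**2 pooled tokens followed by num_queries resampled tokens."""
    pooled = avg_pool(grid, pool_size)
    patches = [vec for row in grid for vec in row]
    # Hypothetical conditioning: shift each query by the mean pooled token.
    context = [sum(col) / len(pooled) for col in zip(*pooled)]
    queries = [[q + c for q, c in zip(vec, context)]
               for vec in query_bank[:num_queries]]
    return pooled + cross_attend(queries, patches)


random.seed(0)
D = 8
# Stand-in for an 8 x 8 grid of vision-encoder patch features.
grid = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(8)]
        for _ in range(8)]
# Stand-in for a bank of learned queries; a prefix is used at smaller budgets.
query_bank = [[random.gauss(0, 1) for _ in range(D)] for _ in range(16)]

# The same query bank serves two different visual-token budgets.
small = compress(grid, pool_size=2, num_queries=4, query_bank=query_bank)
large = compress(grid, pool_size=4, num_queries=16, query_bank=query_bank)
print(len(small), len(large))  # 8 32
```

In this toy setup the pooled tokens keep a fixed grid position each, while the query tokens are free to attend anywhere. That is the intuition behind mixing the two token types, under the assumptions stated above.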

paper

HOW THIS AFFECTS YOU

● Builder: Single-model elastic compression for LVLMs could reduce inference cost at multiple token budgets without separate model variants, though production benchmarks are not detailed here.

● Researcher: Diagnoses and resolves a representational conflict between two dominant visual token compression strategies, with a hybrid architecture that maintains fine-grained spatial grounding.

SOURCE

https://huggingface.co/papers/2605.30126
