Paper of the Week
2 editions
- Paper of the Week — TokenPacker: Efficient Visual Projector with Group-Conditioned Dot-Product Attention for Multimodal Large Language Models...
KV cache compression that cuts memory 40–60% with under 1% accuracy loss — here's the technique your inference stack probably isn't using yet.
- Paper of the Week — Training Language Models to Self-Correct via Reinforcement Learning
SCoRe trains a single LLM to catch and fix its own mistakes via RL — 15.6% better on math, 9.1% on code, no multi-model pipeline needed.
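To put the first edition's KV-cache claim in perspective, the standard cache-size arithmetic shows what a 40–60% cut buys. This is an illustrative sketch with made-up model parameters (a 7B-class decoder is assumed; none of these numbers come from the paper itself):

```python
# Rough KV-cache sizing: why a 40-60% memory cut matters at inference time.
# All parameters below are illustrative assumptions, not from the paper.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    # 2x for keys AND values; one (head_dim)-vector per layer, head, and position.
    # dtype_bytes=2 assumes fp16/bf16 cache entries.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

full = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=4096, batch=8)
halved = full * (1 - 0.5)  # midpoint of the quoted 40-60% savings range

print(f"full cache:     {full / 2**30:.1f} GiB")   # 16.0 GiB
print(f"after 50% cut:  {halved / 2**30:.1f} GiB") # 8.0 GiB
```

At these (assumed) settings the cache alone is 16 GiB, so a 50% compression frees 8 GiB per replica, which is often the difference between fitting one more concurrent batch on the GPU or not.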