Series

Paper of the Week

2 editions

  1. Paper of the Week — TokenPacker: Efficient Visual Projector with Group-Conditioned Dot-Product Attention for Multimodal Large Language Models...

    KV cache compression that cuts memory 40–60% with under 1% accuracy loss — here's the technique your inference stack probably isn't using yet.

  2. Paper of the Week — Training Language Models to Self-Correct via Reinforcement Learning

    SCoRe trains a single LLM to catch and fix its own mistakes via RL — 15.6% better self-correction on MATH, 9.1% on HumanEval, no multi-model pipeline needed.