12 editions

Paper of the Week

Series

June 2026

  1. Paper of the Week — When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines

    RAG corpus-poisoning attack success drops 60-80% once chunking and reranking are added, but most attack evals skip these standard pipeline stages entirely.

  2. Paper of the Week — Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agents

    Activation probes detect credential exfiltration *before* the LLM outputs any tokens — combined with honeytokens and multi-turn leakage tracking, with no model changes required.

May 2026

  1. Paper of the Week — Exploring the Emerging Threats of the Agent Skill Ecosystem

    76 malicious skills confirmed among 3,984 skills audited across AI agent marketplaces: credential theft, backdoor installation, and data exfiltration found hiding in plain sight.

  2. Paper of the Week — Does Code Cleanliness Affect Coding Agents? A Controlled Minimal-Pair Study

    Code cleanliness measurably changes how well coding agents complete tasks — a controlled minimal-pair study with released dataset and reproducible methodology.

  3. Paper of the Week — Useful Memories Become Faulty When Continuously Updated by LLMs

    LLM agent memory degrades when consolidated: episodic traces outperform summarized lessons across 5 agentic tasks.

  4. Paper of the Week — TSCG: Deterministic Tool-Schema Compilation for Agentic LLM Deployments

    TSCG shows that small LLMs (4–14B) make fewer tool-call failures when JSON schemas are compiled into natural-language descriptions before inference.

April 2026

  1. Paper of the Week — Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use

    Rewriting tool descriptions at deployment time—not training time—can recover 20-40% of function-calling accuracy lost to poorly written API docs.

  2. Paper of the Week — Beyond One Output: Visualizing and Comparing Distributions of Language Model Generations

    Visualizing LLM output distributions reveals hidden modes, edge cases, and prompt sensitivity that single-sample evaluation completely misses.

  3. Paper of the Week — Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data

    Lossless prompt compression via dictionary encoding lets LLMs analyze repeated data at a fraction of token cost — no external tools, just in-context learning.

  4. Paper of the Week — Mamba-Based State Space Models for Long-Context Retrieval-Augmented Generation

    Structured state-space models finally beat transformers at document retrieval — here's what the Mamba-based RAG benchmark actually shows.

  5. Paper of the Week — SnapKV: LLM Knows What You are Looking for Before Generation

    KV cache compression that cuts memory 40–60% with under 1% accuracy loss — here's the technique your inference stack probably isn't using yet.

March 2026

  1. Paper of the Week — Training Language Models to Self-Correct via Reinforcement Learning

    SCoRe trains a single LLM to catch and fix its own mistakes via RL — 15.6% better on math, 9.1% on code, no multi-model pipeline needed.