12 editions

Paper of the Week

Series

June 2026

JUN 11 Paper of the Week — When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines
RAG corpus poisoning drops 60-80% when you add chunking+reranking — but most attack evals skip these standard pipeline stages entirely.
JUN 4 Paper of the Week — Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agents
Activation probes detect credential exfiltration *before* the LLM outputs any tokens — combined with honeytokens and multi-turn leakage tracking, with no model changes required.

May 2026

April 2026

March 2026

MAR 26 Paper of the Week — Training Language Models to Self-Correct via Reinforcement Learning
SCoRe trains a single LLM to catch and fix its own mistakes via RL — 15.6% better on math, 9.1% on code, no multi-model pipeline needed.