Paper of the Week — When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines
RAG corpus poisoning loses 60–80% of its effectiveness once chunking and reranking are added, yet most attack evaluations skip these standard pipeline stages entirely.
When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines
Xi Nie, Hongwei Li, Shenghao Wu, Mingxuan Li, Jiachen Li, Wenbo Jiang. Published 2026-06-09. arXiv:2606.11265
One sentence summary
Corpus poisoning attacks against RAG systems lose 60–80% of their effectiveness when evaluated against realistic pipelines that include document chunking and reranking — but almost no prior work tests this.
Why this paper
Most teams building production RAG systems use chunking and a reranker as a matter of course. The security literature evaluating poisoning attacks mostly ignores both. This creates a false picture of how vulnerable your actual stack is.
What they did
The authors systematically re-evaluated four established corpus poisoning attacks (including HotFlip and PoisonedRAG variants) under two conditions: simplified retrieval (whole-document, no reranker) vs. realistic pipelines (chunked documents, reranker applied). They measured attack success rate — whether the injected adversarial passage actually surfaced in the final context sent to the LLM — across both settings.
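As a rough illustration of the metric, attack success rate can be computed as the fraction of queries whose final context contains at least one injected passage. The sketch below assumes a hypothetical retrieve_context callable that returns passage IDs for a query, and the toy IDs are invented for the example; the paper's exact definition (context cutoff, averaging across attacks) may differ.

```python
from typing import Callable, Iterable, Sequence


def attack_success_rate(
    queries: Sequence[str],
    retrieve_context: Callable[[str], Sequence[str]],
    poisoned_ids: Iterable[str],
) -> float:
    """Fraction of queries whose final context holds at least one poisoned passage.

    retrieve_context is a hypothetical hook: it should run the full pipeline
    (chunking, retrieval, reranking) and return the IDs sent to the LLM.
    """
    poisoned = set(poisoned_ids)
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if poisoned.intersection(retrieve_context(q)))
    return hits / len(queries)


# Toy usage: invented passage IDs, not data from the paper.
toy_contexts = {
    "q1": ["doc3", "poison1"],
    "q2": ["doc1", "doc2"],
    "q3": ["poison2", "doc4"],
    "q4": ["doc5", "doc6"],
}
asr = attack_success_rate(
    list(toy_contexts), toy_contexts.__getitem__, {"poison1", "poison2"}
)
print(asr)
```

Measuring at the final context, rather than at the first-stage retriever, is what lets the same function compare the simplified and realistic settings.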
Key findings
- Chunking alone reduces attack success rate by 40–60% because adversarial perturbations are diluted across chunks; the poisoned signal that wins at the document level often loses at the chunk level
- Reranking adds another 20–40% reduction on top of chunking, since rerankers use cross-encoder semantics that adversarial token-level perturbations don’t fool as reliably
- Combined chunking + reranking drops most attacks below 15% success rate, versus 70–90% in the simplified setting
- The gap is largest for gradient-based (HotFlip-style) attacks, which optimize against bi-encoder similarity: the reranker applies a different scoring function that the attack never sees
- Query-agnostic “universal” poison documents are hit hardest; query-specific attacks retain slightly more effectiveness but still degrade substantially
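The mechanism behind the reranking findings can be sketched with a toy two-stage pipeline. Everything here is an assumption made for illustration: the first-stage scorer is a term-count stand-in for bi-encoder similarity, the reranker is a Jaccard-overlap stand-in for a cross-encoder, and the "attack" is plain keyword stuffing rather than HotFlip. The point is only that a passage tuned to win under one scoring function can lose under another.

```python
from typing import Callable, List, Sequence

Scorer = Callable[[str, str], float]


def first_stage_score(query: str, passage: str) -> float:
    """Toy stand-in for bi-encoder similarity: counts query-term occurrences,
    so repeating a query term inflates the score."""
    terms = set(query.lower().split())
    return float(sum(1 for w in passage.lower().split() if w in terms))


def rerank_score(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: Jaccard overlap of distinct terms,
    which repetition cannot inflate."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    union = q | p
    return len(q & p) / len(union) if union else 0.0


def retrieve_then_rerank(
    query: str,
    chunks: Sequence[str],
    first_stage: Scorer,
    reranker: Scorer,
    n_candidates: int,
    k: int,
) -> List[str]:
    """Keep the top n_candidates by first-stage score, then reorder by reranker."""
    candidates = sorted(
        chunks, key=lambda c: first_stage(query, c), reverse=True
    )[:n_candidates]
    return sorted(candidates, key=lambda c: reranker(query, c), reverse=True)[:k]


chunks = [
    "refund refund refund refund buy pills",  # keyword-stuffed "poison" chunk
    "the refund policy lasts thirty days",
    "shipping takes five days",
]
query = "refund policy duration"

# Simplified setting: the first-stage score alone decides the final context.
first_only = retrieve_then_rerank(query, chunks, first_stage_score, first_stage_score, 2, 1)
# Realistic setting: a different scorer reorders the candidates.
with_rerank = retrieve_then_rerank(query, chunks, first_stage_score, rerank_score, 2, 1)
print(first_only, with_rerank)
```

In this toy case the stuffed chunk wins the first stage but is displaced by the legitimate passage after reranking; real bi-encoders and cross-encoders behave far less simply, so this is an intuition aid, not a reproduction of the paper's results.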
Why it matters for practitioners
If you’ve been worried about corpus poisoning based on published attack numbers, those numbers probably don’t reflect your real pipeline. A standard chunking + reranking setup provides meaningful defense without any extra security tooling — and this paper gives you the empirical grounding to understand why.
Conversely, if you’re running a red-team exercise or building a threat model for a RAG system, you should be generating poisoned chunks, not poisoned documents, and testing against your actual reranker rather than a cosine-similarity baseline.
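For red-team work, one way to operate on chunks rather than documents is to run the poisoned document through the same splitter the pipeline uses and inspect where the payload lands. The sketch below assumes a simple word-window splitter with overlap; chunk_words, the POISON marker, and the sample text are all hypothetical, and a production splitter (token-based, sentence-aware) will cut differently.

```python
from typing import List


def chunk_words(text: str, size: int, overlap: int = 0) -> List[str]:
    """Split text into windows of `size` words, each sharing `overlap` words
    with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


# Hypothetical poisoned document; POISON marks the adversarial payload.
document = "alpha beta gamma delta POISON epsilon zeta eta theta iota"
chunks = chunk_words(document, size=4, overlap=1)
poisoned_chunks = [c for c in chunks if "POISON" in c]
print(chunks)
print(poisoned_chunks)
```

Only the chunks that actually carry the payload compete at retrieval time, so these are the units worth scoring against the real retriever and reranker.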
What you can use today
- Audit your current RAG pipeline: if you’re not chunking and reranking, you’re more exposed than the teams that are. Adding a cross-encoder reranker (e.g., a cross-encoder/ms-marco model via sentence-transformers) is the highest-leverage hardening step
- When evaluating third-party corpus poisoning tools or red-team results, confirm whether they tested against chunked retrieval and a reranker; if not, discount the reported attack success rates substantially
- For high-stakes RAG deployments, use this paper’s methodology as a checklist: test your specific chunk size, overlap, and reranker combination rather than assuming published results transfer to your stack
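The checklist idea in the last item can be sketched as a small configuration sweep. This is a minimal harness under stated assumptions: sweep_pipeline_configs and placeholder_evaluate are hypothetical names, the reranker labels are made up, and the evaluate callable is where a real measurement of attack success rate against your own stack would go.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

Config = Tuple[int, int, str]  # (chunk_size, overlap, reranker_name)


def sweep_pipeline_configs(
    chunk_sizes: Sequence[int],
    overlaps: Sequence[int],
    reranker_names: Sequence[str],
    evaluate: Callable[[int, int, str], float],
) -> Dict[Config, float]:
    """Call evaluate once per valid (chunk_size, overlap, reranker) combination."""
    results: Dict[Config, float] = {}
    for size, overlap, name in product(chunk_sizes, overlaps, reranker_names):
        if overlap >= size:
            continue  # an overlap as large as the chunk would never advance
        results[(size, overlap, name)] = evaluate(size, overlap, name)
    return results


def placeholder_evaluate(chunk_size: int, overlap: int, reranker_name: str) -> float:
    """Placeholder only: replace with a real attack-success measurement.
    The returned value is a dummy, not a result."""
    return 0.0


results = sweep_pipeline_configs(
    chunk_sizes=[128, 256],
    overlaps=[0, 32],
    reranker_names=["reranker_a"],  # hypothetical label
    evaluate=placeholder_evaluate,
)
print(sorted(results))
```

Keeping the evaluation behind a single callable makes it easy to rerun the same grid whenever the chunker or reranker changes.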