Tag

arxiv

13 posts

Jun 18, 2026 Paper of the Week

Paper of the Week — Scaling Enterprise Agent Routing: Degradation, Diagnosis, and Recovery

Routing accuracy for 110+ agents drops sharply past 50 tools — new study maps the degradation curve and identifies three recovery strategies that work today.

research papers arxiv practical-ai
Jun 11, 2026 Paper of the Week

Paper of the Week — When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines

RAG corpus poisoning success drops 60-80% once chunking and reranking are added, yet most attack evals skip these standard pipeline stages entirely.

research papers arxiv practical-ai
Jun 4, 2026 Paper of the Week

Paper of the Week — Caught in the Act(ivation): Toward Pre-Output and Multi-Turn Detection of Credential Exfiltration by LLM Agents

Activation probes detect credential exfiltration *before* the LLM outputs any tokens — combined with honeytokens and multi-turn leakage tracking, with no model changes required.

research papers arxiv practical-ai
May 28, 2026 Paper of the Week

Paper of the Week — Exploring the Emerging Threats of the Agent Skill Ecosystem

76 malicious skills confirmed among 3,984 audited from AI agent marketplaces: credential theft, backdoor installation, and data exfiltration found hiding in plain sight.

research papers arxiv practical-ai
May 21, 2026 Paper of the Week

Paper of the Week — Does Code Cleanliness Affect Coding Agents? A Controlled Minimal-Pair Study

Code cleanliness measurably changes how well coding agents complete tasks — a controlled minimal-pair study with released dataset and reproducible methodology.

research papers arxiv practical-ai
May 14, 2026 Paper of the Week

Paper of the Week — Useful Memories Become Faulty When Continuously Updated by LLMs

LLM agent memory degrades when consolidated: episodic traces outperform summarized lessons across 5 agentic tasks.

research papers arxiv practical-ai
May 7, 2026 Paper of the Week

Paper of the Week — TSCG: Deterministic Tool-Schema Compilation for Agentic LLM Deployments

TSCG shows that small LLMs (4–14B) make fewer tool-call failures when JSON schemas are compiled into natural-language descriptions before inference.

research papers arxiv practical-ai
Apr 30, 2026 Paper of the Week

Paper of the Week — Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use

Rewriting tool descriptions at deployment time—not training time—can recover 20-40% of function-calling accuracy lost to poorly written API docs.

research papers arxiv practical-ai
Apr 23, 2026 Paper of the Week

Paper of the Week — Beyond One Output: Visualizing and Comparing Distributions of Language Model Generations

Visualizing LLM output distributions reveals hidden modes, edge cases, and prompt sensitivity that single-sample evaluation completely misses.

research papers arxiv practical-ai
Apr 17, 2026 Paper of the Week

Paper of the Week — Lossless Prompt Compression via Dictionary-Encoding and In-Context Learning: Enabling Cost-Effective LLM Analysis of Repetitive Data

Lossless prompt compression via dictionary encoding lets LLMs analyze repeated data at a fraction of token cost — no external tools, just in-context learning.

research papers arxiv practical-ai
Apr 9, 2026 Paper of the Week

Paper of the Week — Mamba-Based State Space Models for Long-Context Retrieval-Augmented Generation

Structured state-space models finally beat transformers at document retrieval — here's what the Mamba-based RAG benchmark actually shows.

research papers arxiv practical-ai
Apr 2, 2026 Paper of the Week

Paper of the Week — SnapKV: LLM Knows What You are Looking for Before Generation

KV cache compression that cuts memory 40–60% with under 1% accuracy loss — here's the technique your inference stack probably isn't using yet.

research papers arxiv practical-ai
Mar 26, 2026 Paper of the Week

Paper of the Week — Training Language Models to Self-Correct via Reinforcement Learning

SCoRe trains a single LLM to catch and fix its own mistakes via RL — 15.6% better on math, 9.1% on code, no multi-model pipeline needed.

research papers arxiv practical-ai
