Library of the Week — Smolagents

Smolagents — Hugging Face’s minimal framework for building tool-calling agents in pure Python

GitHub · Language: Python · License: Apache 2.0

What it does

Smolagents is Hugging Face’s lightweight agent framework that lets LLMs write and execute Python code to solve tasks, rather than relying on rigid JSON tool-calling schemas. It’s aimed at developers who want capable agents without the abstraction overhead of LangChain or LlamaIndex. The core loop is intentionally simple: model thinks, writes code, code runs, repeat.
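To make that loop concrete, here is a toy sketch of "think, write code, run, repeat" using only the standard library. It is not the smolagents implementation: run_code_loop, scripted_model, and the final_answer convention shown here are illustrative stand-ins, and the bare exec call is exactly what a real deployment must not do with untrusted code.

```python
import contextlib
import io
from typing import Callable, Optional


def run_code_loop(model: Callable[[str], str], task: str, max_steps: int = 5) -> Optional[object]:
    """Toy agent loop: ask the model for code, run it, feed the output back."""
    memory = [f"Task: {task}"]
    result = {}

    def final_answer(value):
        # Generated code calls this to end the loop.
        result["value"] = value

    # One shared namespace, so variables persist from step to step.
    namespace = {"final_answer": final_answer}

    for _ in range(max_steps):
        code = model("\n".join(memory))  # model thinks and writes code
        stdout = io.StringIO()
        try:
            with contextlib.redirect_stdout(stdout):
                exec(code, namespace)  # code runs (unsafe outside a demo)
            observation = stdout.getvalue()
        except Exception as exc:
            observation = f"Error: {exc!r}"
        if "value" in result:
            return result["value"]
        # Record what happened so the next step can build on it (repeat).
        memory.append(f"Code:\n{code}\nObservation:\n{observation}")
    return None


def scripted_model(prompt: str) -> str:
    # Stand-in for an LLM: compute on the first step, answer on the second.
    if "Observation" not in prompt:
        return "total = sum(range(1, 11))\nprint(total)"
    return "final_answer(total)"


answer = run_code_loop(scripted_model, "Add the integers from 1 to 10.")
print(answer)
```

The second step reuses the variable total from the first, which is the property that makes code actions compose across steps.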

Why it stands out

Code agents over JSON agents: instead of emitting structured tool calls for a parser, the model writes Python that the framework executes, by default in a restricted local interpreter and optionally in a remote sandbox. Because the action is code, a single step can loop, branch, store intermediate values, and chain tool outputs, none of which one JSON call can express.
Small surface area: the core agent logic fits in roughly a thousand lines of Python. You can read and understand the entire execution loop in an afternoon, which matters when you’re debugging agents in production.
First-class support for local and hosted models: works with Hugging Face hosted inference and local Transformers models, and reaches OpenAI-compatible endpoints and other hosted providers (GPT-5.5, Claude Opus 4.8, etc.) through thin adapters with minimal config changes.
Built-in tool library: ships with ready-made tools for web search, webpage retrieval, and Python execution, and can load others (image generation, for example) from the Hugging Face Hub, so you’re not building from scratch.
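The difference between the two action styles is easiest to see side by side. The sketch below is a stdlib-only illustration, not smolagents code: get_population, its canned data, and the JSON shape are all hypothetical. It answers "which city is largest?" once with one JSON tool call per model turn and once with a single code action.

```python
import json

# Hypothetical tool backed by canned data, for illustration only.
POPULATIONS = {"Paris": 2_100_000, "Lyon": 520_000, "Marseille": 870_000}


def get_population(city: str) -> int:
    return POPULATIONS[city]


TOOLS = {"get_population": get_population}

# JSON style: each model turn can express exactly one tool call.
json_actions = [
    '{"tool": "get_population", "arguments": {"city": "Paris"}}',
    '{"tool": "get_population", "arguments": {"city": "Lyon"}}',
    '{"tool": "get_population", "arguments": {"city": "Marseille"}}',
]
json_results = []
for raw in json_actions:
    call = json.loads(raw)
    json_results.append(TOOLS[call["tool"]](**call["arguments"]))
# The model would still need another turn to compare the three numbers.

# Code style: one action iterates, calls the tool, and compares.
code_action = """
cities = ["Paris", "Lyon", "Marseille"]
largest = max(cities, key=get_population)
"""
namespace = dict(TOOLS)  # expose the tools as plain functions
exec(code_action, namespace)  # demo only; never exec untrusted code like this
largest = namespace["largest"]

print(len(json_actions), "JSON turns vs 1 code action ->", largest)
```

Three round trips plus a comparison turn collapse into one action, and the intermediate values never have to pass back through the model.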

Quick start

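# Recent smolagents releases name the hosted-inference wrapper InferenceClientModel;
# older releases exposed the same class as HfApiModel.
from smolagents import CodeAgent, DuckDuckGoSearchTool, InferenceClientModel

# Hosted model served through Hugging Face inference (needs a Hugging Face token).
model = InferenceClientModel(model_id="Qwen/Qwen3-32B")

# A code agent with a single web-search tool.
agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],
    model=model,
)

# run() repeats the loop until the generated code returns a final answer
# or the step limit is reached.
result = agent.run(
    "Find the latest benchmark scores for Devstral 2 and summarize them."
)
print(result)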

When to use it

You want agents that can do real multi-step computation (data wrangling, API calls, math) and JSON tool schemas keep breaking on complex tasks.
You’re already in the Hugging Face ecosystem and want something that integrates naturally with Transformers and the Hub.
You need a framework you can actually read and modify — not one where debugging requires spelunking through 40 layers of abstraction.

When to skip it

Code execution in your environment is a security concern and sandboxing adds too much ops overhead — the code-first model is the whole point, so if you can’t run it safely, you lose the core value proposition.
You need production-grade observability, retry logic, and state persistence out of the box; smolagents is still young here and you’ll be stitching those pieces together yourself.
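As an idea of what that stitching looks like, here is a minimal retry wrapper with exponential backoff. It is a generic sketch rather than anything smolagents provides: run_with_retries and flaky_run are hypothetical names, and the wrapper only assumes a callable that takes a task string, which is the shape of agent.run in the quick start.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(
    run: Callable[[str], T],
    task: str,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call run(task), retrying on any exception with exponential backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return run(task)
        except Exception as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Wait base_delay, then 2x, 4x, ... between attempts.
                sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"task failed after {attempts} attempts") from last_error


# Stand-in for agent.run that fails twice before succeeding.
calls = {"n": 0}


def flaky_run(task: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient failure")
    return f"done: {task}"


outcome = run_with_retries(flaky_run, "summarize", attempts=3, base_delay=0.0)
print(outcome)
```

With a real agent you would pass agent.run as the callable. Blind retries are only safe when the agent's tools are idempotent, since a retried run may repeat side effects such as API writes.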

A note on sandboxing

The code-first design means the local executor is the attack surface, and it has shipped real CVEs (code injection and SSRF disclosed in early-to-mid 2026). Don’t run smolagents against untrusted inputs in the default LocalPythonExecutor — use one of the sandboxed execution backends the project now provides (E2B, Modal, Docker, or WebAssembly). Treat the local executor as a dev-loop convenience, not a production posture.
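One way to enforce that rule in code is to choose the executor from the trust level of the input instead of relying on the default. The sketch below makes several assumptions: pick_executor_type and build_agent are hypothetical helpers, the preference order is arbitrary, and the executor_type argument and its string values ("local", "docker", "e2b", "modal", "wasm") reflect my understanding of CodeAgent, so check them against the version you have installed.

```python
from typing import Set

# Sandboxed backends in an arbitrary preference order (assumed names).
SANDBOXED = ("docker", "e2b", "modal", "wasm")


def pick_executor_type(untrusted_input: bool, available: Set[str]) -> str:
    """Return 'local' only for trusted input; otherwise require a sandbox."""
    if not untrusted_input:
        return "local"
    for name in SANDBOXED:
        if name in available:
            return name
    # Fail closed instead of silently falling back to the local executor.
    raise RuntimeError("untrusted input but no sandboxed executor is configured")


def build_agent(model, tools, untrusted_input: bool, available: Set[str]):
    # Imported lazily so the helper above works without smolagents installed.
    from smolagents import CodeAgent

    return CodeAgent(
        tools=tools,
        model=model,
        executor_type=pick_executor_type(untrusted_input, available),
    )


print(pick_executor_type(False, set()))
print(pick_executor_type(True, {"e2b", "wasm"}))
```

Failing closed is the important design choice here: if no sandbox is configured, the agent refuses to start rather than quietly running untrusted work locally.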

The verdict

Smolagents is the most honest implementation of the “LLMs are good at writing code” insight, and the minimal design makes it genuinely auditable. It’s not the right choice if you need a batteries-included platform, but for developers who want a clean foundation to build on without fighting their framework, it’s hard to beat right now.