Library of the Week — Outlines
A weekly teardown of one open-source AI/ML library: what it does, why it stands out, and when to use it.
Outlines — Structured text generation with guaranteed schema compliance
GitHub · Stars: ~12k · Language: Python · License: Apache 2.0
What it does
Outlines makes LLM outputs conform to exact schemas — JSON, regex patterns, context-free grammars, or Python types — without post-hoc parsing hacks. It works by manipulating the token sampling process directly, masking invalid tokens at each step so the model cannot produce malformed output. It’s aimed at developers who need reliable structured data from local or API-based models.
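The masking idea can be illustrated without any ML stack: at each step, assign every format-violating token a log-probability of negative infinity, then sample only from the survivors. A toy, character-level sketch (not Outlines' actual implementation) that forces output to match a `DDD-DDDD` phone pattern:

```python
import math
import random

# Toy vocabulary: digits, a dash, and some "junk" tokens the model might prefer.
VOCAB = list("0123456789-abc")

# Target format: three digits, a dash, four digits (e.g. "555-0199").
PATTERN = ["digit"] * 3 + ["dash"] + ["digit"] * 4

def allowed_at(step):
    """Characters the pattern permits at this position."""
    return set("0123456789") if PATTERN[step] == "digit" else {"-"}

def masked_sample(logits, allowed, rng):
    # Disallowed tokens get -inf, so softmax assigns them zero probability.
    masked = [l if tok in allowed else -math.inf for tok, l in zip(VOCAB, logits)]
    weights = [math.exp(l) if l != -math.inf else 0.0 for l in masked]
    return rng.choices(VOCAB, weights=weights, k=1)[0]

def generate_phone(rng):
    out = []
    for step in range(len(PATTERN)):
        logits = [rng.gauss(0, 1) for _ in VOCAB]  # stand-in for model logits
        out.append(masked_sample(logits, allowed_at(step), rng))
    return "".join(out)

print(generate_phone(random.Random(0)))  # always matches \d{3}-\d{4}
```

Because invalid tokens are removed before sampling rather than filtered afterward, malformed output is unreachable by construction, no matter how the underlying logits are distributed.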
Why it stands out
- Guarantees, not hopes: Unlike prompt engineering or retry loops, Outlines constrains the logits at inference time, so a JSON object is structurally impossible to corrupt mid-generation
- Pydantic-native: Pass a `BaseModel` subclass and Outlines derives the grammar automatically; no separate schema-authoring step
- Backend agnostic: Works with `transformers`, `llama.cpp`, `vllm`, and `mlx`; the API is the same whether you're on a GPU server or an M-series Mac
- Regex and CFG support: Beyond JSON, you can enforce phone number formats, SQL dialects, or any custom grammar, which few competitors match
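The Pydantic-native point rests on the fact that a typed model already encodes a schema. The real library compiles Pydantic's JSON-schema export into a grammar; the derivation step can be sketched with only the standard library, using a dataclass and a hypothetical `to_json_schema` helper:

```python
from dataclasses import dataclass, fields

# Map Python annotations to JSON-schema type names. (Illustrative only:
# Outlines itself consumes Pydantic's model_json_schema() output.)
TYPE_MAP = {str: "string", float: "number", int: "integer", bool: "boolean"}

@dataclass
class Product:
    name: str
    price: float
    in_stock: bool

def to_json_schema(cls):
    """Derive a minimal JSON schema from a dataclass's type hints."""
    return {
        "type": "object",
        "properties": {f.name: {"type": TYPE_MAP[f.type]} for f in fields(cls)},
        "required": [f.name for f in fields(cls)],
    }

print(to_json_schema(Product))
```

The point of doing this automatically is that the type definition stays the single source of truth: there is no second, hand-written schema to drift out of sync with the code that consumes the output.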
Quick start
```python
from pydantic import BaseModel
from outlines import models, generate

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool

model = models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
generator = generate.json(model, Product)
result = generator("Extract product info: 'Widget Pro costs $29.99 and is available.'")
print(result)
# Product(name='Widget Pro', price=29.99, in_stock=True)
```
When to use it
- You’re building extraction or classification pipelines where downstream code breaks on malformed JSON — retries aren’t acceptable
- You need structured output from a locally-hosted model that doesn’t have native function-calling support
- You’re enforcing domain-specific formats (dates, identifiers, DSLs) that go beyond what JSON schema alone can express
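For contrast, here is the fragile pattern constrained decoding eliminates: re-prompting until the JSON happens to parse. A minimal sketch with a simulated flaky model (the sample outputs are invented for illustration):

```python
import json

def parse_with_retries(generate, max_tries=3):
    # Without constrained decoding, malformed JSON forces a re-prompt:
    # extra latency and cost on every failure, and no upper bound on failures.
    for _ in range(max_tries):
        text = generate()
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            continue
    raise ValueError("model never produced valid JSON")

# Simulated flaky model: truncated output, then garbage, then success.
outputs = iter(['{"name": "Widget', '{oops}', '{"name": "Widget Pro"}'])
print(parse_with_retries(lambda: next(outputs)))  # {'name': 'Widget Pro'}
```

With Outlines the retry loop disappears entirely, because the first response is already guaranteed to parse.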
When to skip it
- You’re exclusively using hosted frontier APIs like GPT-5.4 or Claude Sonnet 4.6 that already offer robust native structured output — the overhead of running a local model may not be worth it
- Your latency budget is extremely tight; constrained decoding adds measurable per-token overhead compared to unconstrained sampling
The verdict
Outlines solves a real, persistent pain point — flaky structured output — at the right level of abstraction (the sampler, not the prompt). If any part of your stack involves local model inference and structured data extraction, it should be your first stop before reaching for fragile JSON-repair utilities. The Pydantic integration in particular makes it feel like a natural extension of a modern Python stack rather than a bolted-on tool.