Structured Output¶
Enforce response schemas on agent outputs with automatic validation, retry, and provider-native extraction.
Quick Start¶
Pass any supported schema type to returns= on agent.run():
from pydantic import BaseModel, Field
from cogent import Agent

class ContactInfo(BaseModel):
    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    phone: str | None = Field(None, description="Phone number")

agent = Agent(name="Extractor", model="gpt-5.4")
result = await agent.run(
    "Extract: John Doe, john@acme.com, 555-1234",
    returns=ContactInfo,
)
contact = result.content.data  # ContactInfo instance
print(contact.name)   # "John Doe"
print(contact.email)  # "john@acme.com"
result.content is a StructuredResult with:
- .data: the validated instance (or parsed value for bare types)
- .valid: whether validation succeeded
- .error: validation error message if .valid is False
- .attempts: number of attempts (1 unless retry was needed)
- .raw: raw LLM output (when include_raw=True)
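As a rough illustration of how a caller might consume these attributes, the sketch below uses a hypothetical stand-in dataclass (StructuredResultSketch) that only mirrors the attribute names listed above. It is not Cogent's actual class, and the unwrap helper is an assumed convenience, not part of the library.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StructuredResultSketch:
    """Hypothetical stand-in mirroring the documented attribute names."""
    data: Any = None
    valid: bool = True
    error: Optional[str] = None
    attempts: int = 1
    raw: Optional[str] = None


def unwrap(result: StructuredResultSketch) -> Any:
    """Return .data only when validation succeeded; otherwise raise with context."""
    if not result.valid:
        raise ValueError(
            f"validation failed after {result.attempts} attempt(s): {result.error}"
        )
    return result.data


ok = StructuredResultSketch(data={"name": "John Doe"})
bad = StructuredResultSketch(valid=False, error="email: field required", attempts=3)

print(unwrap(ok)["name"])  # John Doe
```

Checking .valid in one place keeps the rest of the calling code free of None checks on .data.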
Supported Schema Types¶
| Type | Example | Returns |
|---|---|---|
| Pydantic model | returns=ContactInfo | ContactInfo instance |
| Dataclass | returns=MeetingAction | Dataclass instance |
| TypedDict | returns=Address | dict matching TypedDict |
| JSON Schema | returns={"type": "object", ...} | dict |
| Bare primitives | returns=int, returns=str | int, str value directly |
| Literal | returns=Literal["yes", "no"] | Constrained str |
| list | returns=list[str] | list[str] |
| set | returns=set[str] | set[str] (unique items) |
| tuple | returns=tuple[str, int] | tuple with typed positions |
| Union | returns=Success \| Failure | Whichever variant matches |
| Enum | returns=Priority | Enum member |
| None | returns=type(None) | None (confirmation) |
| dict | returns=dict | Agent-decided structure |
Field Guidance¶
Pydantic Field metadata flows directly into the JSON schema that the model sees. Richer metadata produces better output.
Descriptions¶
class BugReport(BaseModel):
    title: str = Field(description="Short, specific title (not 'bug' or 'error')")
    severity: str = Field(description="One of: critical, high, medium, low")
    steps: list[str] = Field(description="Numbered reproduction steps")
Per-Field Examples¶
Field(examples=[...]) adds example values to the JSON schema per property:
class CommitMessage(BaseModel):
    type: str = Field(
        description="Commit type",
        examples=["feat", "fix", "refactor", "docs", "chore"],
    )
    scope: str | None = Field(
        None,
        description="Affected module",
        examples=["auth", "api", "cli"],
    )
    description: str = Field(
        description="Imperative mood summary, max 72 chars",
        max_length=72,
        examples=["add password reset endpoint", "fix null pointer in user lookup"],
    )
Constraints¶
ge, le, min_length, max_length, and pattern constrain values:
class MovieReview(BaseModel):
    rating: float = Field(ge=0.0, le=10.0, description="Rating out of 10")
    summary: str = Field(min_length=50, max_length=300, description="One-paragraph review")
    tags: list[str] = Field(min_length=1, max_length=5, description="Genre tags")
pattern enforces regex constraints on strings — the model sees the pattern in the schema and formats accordingly, and Pydantic validates on parse with automatic retry on mismatch:
class ContactCard(BaseModel):
    phone: str = Field(pattern=r"^\d{3}-\d{3}-\d{4}$", description="US phone")
    zip_code: str = Field(pattern=r"^\d{5}(-\d{4})?$", description="US ZIP")

result = await agent.run(
    "Jane Smith — call her at (555) 867 5309, zip 9 0 2 1 0",
    returns=ContactCard,
)
# phone: "555-867-5309" (reformatted to match pattern)
# zip_code: "90210"
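To see why the raw input has to be reformatted before it can pass validation, the same two patterns can be checked with Python's standard re module. This is only an illustration of the patterns themselves: the matches helper is a hypothetical name, and Pydantic performs its own pattern check during validation.

```python
import re

# Same patterns as ContactCard above.
PHONE = re.compile(r"^\d{3}-\d{3}-\d{4}$")
ZIP_CODE = re.compile(r"^\d{5}(-\d{4})?$")


def matches(pattern: re.Pattern, value: str) -> bool:
    """Return True when the value satisfies the anchored pattern."""
    return pattern.search(value) is not None


print(matches(PHONE, "(555) 867 5309"))  # False: raw input, would trigger a retry
print(matches(PHONE, "555-867-5309"))    # True: the reformatted value
print(matches(ZIP_CODE, "9 0 2 1 0"))    # False
print(matches(ZIP_CODE, "90210"))        # True
```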
Few-Shot Examples¶
Pass examples= to agent.run() to inject few-shot examples into the prompt. The model sees concrete output samples alongside the JSON schema, which typically improves schema adherence. examples= accepts model instances (auto-serialised via model_dump()) or plain dicts:
class ReleaseNote(BaseModel):
    category: str = Field(description="One of: added, changed, fixed, removed, deprecated")
    title: str = Field(description="Short imperative summary")
    description: str = Field(description="One-sentence elaboration")
    breaking: bool = Field(default=False, description="Whether this is a breaking change")

result = await agent.run(
    "We added automatic retry on validation failure.",
    returns=ReleaseNote,
    examples=[
        ReleaseNote(
            category="added",
            title="Streaming support for agent responses",
            description="Agents can now stream via stream=True.",
            breaking=False,
        ),
        ReleaseNote(
            category="changed",
            title="Renamed output parameter to returns",
            description="The output= parameter on run() is now returns=.",
            breaking=True,
        ),
    ],
)
Model instances give full IDE autocomplete and Pydantic validation at definition time. Dicts work too when you prefer brevity:
result = await agent.run(
    "We added automatic retry on validation failure.",
    returns=ReleaseNote,
    examples=[
        {"category": "added", "title": "Streaming support", "description": "...", "breaking": False},
    ],
)
The Pydantic-native ConfigDict(json_schema_extra={"examples": [...]}) approach also works — Cogent falls back to it when run(examples=) is not provided.
All Field metadata (descriptions, examples, constraints) is included in the JSON schema via model_json_schema(). Cogent injects this schema into the prompt automatically, so no additional configuration is required.
Schema Types in Detail¶
Pydantic Models¶
The recommended approach. Full validation, Field metadata, and nested model support:
class ContactInfo(BaseModel):
    name: str = Field(description="Full name")
    email: str = Field(description="Email address")
    phone: str | None = Field(None, description="Phone number")

result = await agent.run("Extract: John Doe, john@acme.com", returns=ContactInfo)
print(result.content.data.name)  # "John Doe"
Dataclasses¶
Standard Python dataclasses work without changes:
from dataclasses import dataclass

@dataclass
class MeetingAction:
    task: str
    assignee: str
    priority: str
    due_date: str | None = None

result = await agent.run("Extract action: ...", returns=MeetingAction)
TypedDict¶
from typing import TypedDict

class Address(TypedDict):
    street: str
    city: str
    country: str

result = await agent.run("Parse: 10 Downing St, London, UK", returns=Address)
Bare Primitives¶
Return values directly without wrapping in a model:
result = await agent.run("How many words: Hello world test", returns=int)
print(result.content.data) # 3
result = await agent.run("Summarize in one word", returns=str)
print(result.content.data) # "Programming"
result = await agent.run("Is the sky blue?", returns=bool)
print(result.content.data) # True
Literal Types¶
Constrained string choices:
from typing import Literal

result = await agent.run(
    "Review this implementation",
    returns=Literal["PROCEED", "REVISE"],
)
print(result.content.data)  # "PROCEED"
Collections¶
Lists, sets, and tuples work as bare types:
result = await agent.run("Extract tags", returns=list[str])
print(result.content.data) # ["python", "async", "fastapi"]
result = await agent.run("Unique categories", returns=set[str])
print(result.content.data) # {"ai", "python", "llm"}
result = await agent.run("Player: Sarah, 25, 95.5", returns=tuple[str, int, float])
print(result.content.data) # ("Sarah", 25, 95.5)
Partial validation for list[T]¶
When extracting many items, some may fail validation while others are fine. Cogent validates every item instead of stopping at the first failure. Valid items are kept across retry attempts, and the model is asked to fix only the broken ones:
from pydantic import BaseModel, Field

class Employee(BaseModel):
    name: str
    title: str
    years: int = Field(ge=0)

result = await agent.run(
    "Extract all employees from this document",
    returns=list[Employee],
)
# If 48 of 50 items pass and 2 fail, the retry prompt shows only the 2
# failed items with their specific validation errors. The model fixes
# just those items; the 48 valid ones carry forward untouched.
This avoids the cost of regenerating an entire list when only a few items need correction.
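The bookkeeping behind this idea can be sketched in plain Python. The example below is an assumption-laden illustration, not Cogent's implementation: validate_employee is a hand-written stand-in for Pydantic validation of Employee, and split_valid and merge_fixes are hypothetical helper names.

```python
from typing import Any, Callable

Validator = Callable[[dict], list]


def validate_employee(item: dict) -> list:
    """Return error messages for one item; an empty list means it is valid."""
    errors = []
    for key in ("name", "title"):
        if not isinstance(item.get(key), str):
            errors.append(f"{key}: expected a string")
    years = item.get("years")
    if not isinstance(years, int) or isinstance(years, bool) or years < 0:
        errors.append("years: expected an integer >= 0")
    return errors


def split_valid(items: list, validator: Validator) -> tuple:
    """Validate every item; keep valid ones by index and collect failures."""
    valid, failed = {}, {}
    for index, item in enumerate(items):
        errors = validator(item)
        if errors:
            failed[index] = errors
        else:
            valid[index] = item
    return valid, failed


def merge_fixes(valid: dict, fixes: dict, validator: Validator) -> tuple:
    """Re-validate only the corrected items and merge the ones that now pass."""
    merged, still_failed = dict(valid), {}
    for index, item in fixes.items():
        errors = validator(item)
        if errors:
            still_failed[index] = errors
        else:
            merged[index] = item
    return merged, still_failed


items = [
    {"name": "Ada", "title": "Engineer", "years": 5},
    {"name": "Bob", "title": "Analyst", "years": -2},
]
first_valid, first_failed = split_valid(items, validate_employee)
# Only index 1 (with its error list) would be shown to the model for correction.
fixes = {1: {"name": "Bob", "title": "Analyst", "years": 2}}
merged, remaining = merge_fixes(first_valid, fixes, validate_employee)
final_list = [merged[i] for i in sorted(merged)]
```

Keying both dictionaries by the original index is what lets corrected items slot back into their original positions.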
Tip
For complex collection schemas, wrap the collection in a Pydantic model (for example, a model with a single list field) for reliability.
Union Types¶
The agent chooses which variant to return based on the content:
from typing import Literal

from pydantic import BaseModel

class Success(BaseModel):
    status: Literal["success"] = "success"
    result: str

class Error(BaseModel):
    status: Literal["error"] = "error"
    message: str

result = await agent.run("Handle payment", returns=Success | Error)
# Returns Success or Error based on content
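On the caller's side, a union result usually needs a branch on which variant came back. The sketch below uses plain dataclass stand-ins for the Success and Error models above so that it runs without any dependencies; describe is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Success:  # dataclass stand-in for the Pydantic model above
    result: str
    status: str = "success"


@dataclass
class Error:  # dataclass stand-in for the Pydantic model above
    message: str
    status: str = "error"


def describe(outcome: Union[Success, Error]) -> str:
    """Branch on the concrete variant that was returned."""
    if isinstance(outcome, Success):
        return f"ok: {outcome.result}"
    return f"failed: {outcome.message}"


print(describe(Success(result="payment captured")))  # ok: payment captured
print(describe(Error(message="card declined")))      # failed: card declined
```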
Enum Types¶
from enum import Enum

class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

result = await agent.run("Production server down!", returns=Priority)
print(result.content.data)  # Priority.CRITICAL
Dynamic Structure¶
Let the agent decide the output fields:
result = await agent.run("Analyze user feedback", returns=dict)
print(result.content.data) # {"sentiment": "positive", "score": 8, ...}
None Type¶
For actions that just need confirmation:
result = await agent.run("Delete temp files", returns=type(None))
print(result.content.data is None) # True
JSON Schema¶
Raw JSON Schema dicts for dynamic or programmatic schemas:
event_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "date": {"type": "string", "description": "YYYY-MM-DD"},
        "attendees": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "date"],
}
result = await agent.run("Parse: Team lunch tomorrow at noon", returns=event_schema)
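Because the schema is an ordinary dict, it can also be assembled at runtime. The sketch below shows one way to build the same event schema programmatically; build_object_schema is a hypothetical helper, not part of Cogent.

```python
import json


def build_object_schema(fields: dict, required: list) -> dict:
    """Assemble a JSON Schema object definition from per-field schemas."""
    unknown = [name for name in required if name not in fields]
    if unknown:
        raise ValueError(f"required names missing from fields: {unknown}")
    return {
        "type": "object",
        "properties": dict(fields),
        "required": list(required),
    }


event_schema = build_object_schema(
    {
        "title": {"type": "string"},
        "date": {"type": "string", "description": "YYYY-MM-DD"},
        "attendees": {"type": "array", "items": {"type": "string"}},
    },
    required=["title", "date"],
)
print(json.dumps(event_schema, indent=2))
```

The resulting dict is equal to the hand-written event_schema above and could be passed to returns= in the same way.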
ResponseSchema — Advanced Configuration¶
For fine-grained control over extraction method, retry behavior, and raw output access, use ResponseSchema:
from cogent.agent.output import OutputMethod, ResponseSchema

config = ResponseSchema(
    schema=ProductReview,
    method=OutputMethod.AUTO,  # AUTO, NATIVE, TOOL, or PROMPT
    retry_on_error=True,       # Self-correct on validation failure
    max_retries=2,             # Up to 2 correction attempts
    include_raw=True,          # Keep raw LLM output for debugging
)
result = await agent.run("Parse this review: ...", returns=config)
print(result.content.attempts)  # Number of attempts needed
print(result.content.raw)       # Raw LLM output (if include_raw=True)
Output Methods¶
| Method | Behavior |
|---|---|
| AUTO | Cogent selects the best method for the provider (default) |
| NATIVE | Use provider-native structured output (OpenAI json_schema, Gemini response_schema) |
| TOOL | Extract via tool call: the schema is presented as a tool the model must call |
| PROMPT | Prompt engineering: the schema is injected into the system prompt |
Self-Correction Retry¶
When validation fails, Cogent appends a correction turn showing the model exactly what it produced and why it failed. The model self-corrects conversationally:
Call 1 Human: <task + schema> → AI: <bad JSON>
Call 2 Human: validation error + output → AI: <corrected JSON> ✅
This is not blind re-prompting — the model sees its own mistake and the validation error.
For list[T] schemas, the correction is targeted: only the failed items (with their index and specific error) are shown. Valid items carry forward across retry attempts, so the model fixes a small patch instead of regenerating the entire list.
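The conversational correction loop described above can be sketched without any LLM at all. Everything in this example is an assumption made for illustration: run_with_correction, validate_person, and fake_model are hypothetical names, the result is a plain dict, not Cogent's StructuredResult, and the wording of the correction message is invented.

```python
import json
from typing import Any, Callable

Messages = list


def run_with_correction(
    task: str,
    call_model: Callable[[Messages], str],
    validate: Callable[[Any], Any],
    max_retries: int = 2,
) -> dict:
    """Call the model; on validation failure, append a correction turn and retry."""
    messages = [{"role": "user", "content": task}]
    error = None
    total_attempts = max_retries + 1
    for attempt in range(1, total_attempts + 1):
        output = call_model(messages)
        try:
            data = validate(json.loads(output))
            return {"data": data, "valid": True, "error": None, "attempts": attempt}
        except (ValueError, TypeError) as exc:  # JSONDecodeError is a ValueError
            error = str(exc)
        # Show the model its own output and the reason it was rejected.
        messages.append({"role": "assistant", "content": output})
        messages.append({
            "role": "user",
            "content": f"Validation error: {error}. Return corrected JSON only.",
        })
    return {"data": None, "valid": False, "error": error, "attempts": total_attempts}


def validate_person(value: Any) -> dict:
    """Minimal hand-written check standing in for schema validation."""
    if not isinstance(value, dict) or not isinstance(value.get("age"), int):
        raise ValueError("age: expected an integer")
    return value


def fake_model(messages: Messages) -> str:
    """Stand-in for an LLM: wrong on the first call, fixed after a correction turn."""
    if len(messages) == 1:
        return '{"name": "John Doe", "age": "thirty"}'
    return '{"name": "John Doe", "age": 30}'


result = run_with_correction("Extract: John Doe is 30 years old", fake_model, validate_person)
print(result["attempts"], result["data"])  # 2 {'name': 'John Doe', 'age': 30}
```

With max_retries=2 the loop makes at most three calls in total, which matches the "up to 2 correction attempts" wording used for ResponseSchema.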
Exhaustion Behavior¶
By default, when all retry attempts are exhausted, result.content is a StructuredResult with valid=False and data=None. The caller must check .valid before accessing .data.
Set on_exhaustion="raise" on ResilienceConfig to get an explicit OutputValidationError instead:
from cogent.agent.resilience import ResilienceConfig

agent = Agent(
    name="Extractor",
    model=model,
    resilience=ResilienceConfig(
        on_exhaustion="raise",  # Raise OutputValidationError on exhaustion
    ),
)
result = await agent.run("Parse this review: ...", returns=ProductReview)
# If validation fails after retries, OutputValidationError is raised
Use fallback_model to escalate to a stronger model before giving up:
agent = Agent(
    name="Extractor",
    model="gpt-5.4-nano",
    resilience=ResilienceConfig(
        fallback_model="gpt-5.4",  # one extra attempt with a larger model
        on_exhaustion="raise",
    ),
)
| Value | Behavior |
|---|---|
| "return" | Return StructuredResult(valid=False); caller checks .valid (default) |
| "raise" | Raise OutputValidationError for an explicit failure |
Subagent Structured Output¶
Use returns= on the Agent constructor to declare what a subagent produces. The coordinator receives clean JSON instead of a string representation.
class ReviewScore(BaseModel):
    score: int
    verdict: Literal["approved", "needs_revision"]
    feedback: str

reviewer = Agent(
    name="reviewer",
    model="gpt-5.4-mini",
    description="Review copy for quality and compliance",
    returns=ReviewScore,
    instructions="Review content. Score 1-10. Be concise.",
)
editor = Agent(
    name="editor",
    model="gpt-5.4-mini",
    subagents=[reviewer],
    instructions="Have the reviewer check the copy.",
)
result = await editor.run("Review this tweet")
# Coordinator LLM sees: {"score": 8, "verdict": "approved", "feedback": "..."}
How it works:
1. returns=ReviewScore is set on the subagent
2. When the coordinator delegates, agent.run(task, returns=ReviewScore) is called automatically
3. The executor serializes StructuredResult.data as JSON for the parent LLM
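The last step, turning validated data into JSON for the parent, can be approximated as follows. How Cogent's executor does this internally is not shown here, so treat this as a sketch under assumptions: to_parent_payload is a hypothetical helper, and ReviewScore is redefined as a dataclass so the example runs with the standard library only.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


@dataclass
class ReviewScore:  # dataclass stand-in for the Pydantic model above
    score: int
    verdict: str
    feedback: str


def to_parent_payload(data: Any) -> str:
    """Serialize validated data to the JSON text a coordinator could read."""
    if hasattr(data, "model_dump"):  # Pydantic v2 model instances
        data = data.model_dump()
    elif is_dataclass(data) and not isinstance(data, type):
        data = asdict(data)
    return json.dumps(data)


payload = to_parent_payload(
    ReviewScore(score=8, verdict="approved", feedback="Clear and on-brand.")
)
print(payload)
# {"score": 8, "verdict": "approved", "feedback": "Clear and on-brand."}
```

Passing JSON instead of a string representation means the coordinator can read field names and values directly.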
This works identically for remote A2A subagents:
from cogent import Agent

remote_reviewer = Agent(
    url="http://review-svc/a2a",
    name="reviewer",
    description="Review copy for quality and compliance",
    returns=ReviewScore,
)

# Or override per-call
response = await remote_reviewer.run("Review this tweet", returns=ReviewScore)
print(response.content.data.score)  # 8
| Situation | Use returns=? |
|---|---|
| Subagent produces a well-defined data structure | Yes |
| Parent needs to branch on subagent output | Yes |
| Subagent writes freeform text (articles, copy) | No — plain string is fine |
| Multiple coordinators reuse the same subagent | Yes — schema defined once |
Note
returns= on the constructor only affects behavior when the agent is called as a subagent. A standalone agent.run() call without an explicit returns= kwarg ignores it.
Low-Level Model API¶
For direct model usage without the agent layer, use with_structured_output():
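import json

from pydantic import BaseModel
from cogent.models.openai import OpenAIChat

class Person(BaseModel):  # example schema; fields inferred from the output shown below
    name: str
    age: int

llm = OpenAIChat(model="gpt-5.4").with_structured_output(Person)
response = await llm.ainvoke([
    {"role": "user", "content": "Extract: John Doe is 30 years old"}
])
data = json.loads(response.content)  # {"name": "John Doe", "age": 30}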
Provider Support¶
| Provider | Method | Strict Mode |
|---|---|---|
| OpenAI | json_schema | ✅ |
| Anthropic | Tool-based | ✅ |
| Gemini | response_schema | ✅ |
| Groq | json_mode | ❌ |
| xAI | json_schema | ✅ |
| DeepSeek | deepseek-chat only | ❌ |
| Ollama | json_mode | ❌ |
Methods¶
# json_schema (default — strict typing)
llm.with_structured_output(Person, method="json_schema")
# json_mode (less strict, more compatible)
llm.with_structured_output(Person, method="json_mode")
With Tools¶
Structured output and tools can be combined:
from cogent import tool

@tool
def get_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Sunny in {location}"

llm = OpenAIChat(model="gpt-5.4")
llm = llm.bind_tools([get_weather])
llm = llm.with_structured_output(Person)
Examples¶
See examples/structured_output/ for runnable examples:
- schema_types.py: all 11 schema types
- field_guidance.py: Field(description=...), Field(examples=[...]), numeric/length constraints, Field(pattern=...) regex
- few_shot_examples.py: few-shot examples via run(examples=)
- advanced_patterns.py: ResponseSchema, nested models, schema reuse
- list_extraction.py: extracting many items with partial validation and targeted retry