Building Custom Tools¶
Create production-ready tools for cogent agents.
Overview¶
Tools are functions that agents can call to interact with the world — APIs, databases, files, web searches, and more. Cogent provides two ways to add tools:
- @tool decorator — Create custom tools from any function (this guide)
- Toolkits — Use pre-built tool classes like WebSearch, FileSystem, CodeSandbox
Toolkits are classes that provide tools. They handle setup, state management, and expose multiple related tools:
from cogent import Agent
from cogent.toolkits import WebSearch, FileSystem, CodeSandbox

agent = Agent(
    model="gpt-5.4-mini",
    tools=[
        WebSearch(),                           # Provides: search, fetch_url
        FileSystem(allowed_paths=["./data"]),  # Provides: read, write, list
        CodeSandbox(),                         # Provides: execute_python
    ],
)
See Toolkits for 12+ production-ready toolkits — KnowledgeGraph, Browser, PDF, Shell, MCP, and more.
For custom logic, use the @tool decorator:
Quick Start¶
from cogent import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city.

    Args:
        city: The city name to get weather for.

    Returns:
        Weather information as a string.
    """
    # Your implementation
    return f"Weather in {city}: 72°F, sunny"
That's it! The @tool decorator automatically:
- Extracts function signature → JSON schema
- Parses docstring → tool description
- Handles sync/async execution
- Provides error handling
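To make the first two bullets concrete, here is a minimal sketch of how a signature and docstring could be turned into a tool schema with only the standard library. It is not Cogent's implementation: `build_schema` and `JSON_TYPES` are hypothetical names, and only simple scalar types are handled.

```python
import inspect
from typing import Any, get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_schema(fn) -> dict[str, Any]:
    """Derive a function-calling schema from a signature and docstring (illustrative only)."""
    hints = get_type_hints(fn)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        prop: dict[str, Any] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default, so the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    doc = inspect.getdoc(fn) or ""
    summary = doc.splitlines()[0] if doc else ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": summary,
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


def get_weather(city: str, units: str = "imperial") -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 72°F, sunny"


schema = build_schema(get_weather)
print(schema["function"]["parameters"]["required"])  # ['city']
```

Parameters without a default end up in `required`, which is why a default value is the usual way to mark a tool argument as optional.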
The @tool Decorator¶
from cogent import tool

@tool
def my_tool(param1: str, param2: int = 10) -> str:
    """Tool description goes here.

    Args:
        param1: Description of param1.
        param2: Description of param2 (optional).

    Returns:
        Description of return value.
    """
    return f"Result: {param1} x {param2}"
What Gets Generated¶
{
  "type": "function",
  "function": {
    "name": "my_tool",
    "description": "Tool description goes here.",
    "parameters": {
      "type": "object",
      "properties": {
        "param1": {
          "type": "string",
          "description": "Description of param1."
        },
        "param2": {
          "type": "integer",
          "description": "Description of param2 (optional).",
          "default": 10
        }
      },
      "required": ["param1"]
    }
  }
}
Return Type Information¶
The @tool decorator automatically extracts return type information and includes it in the tool description. This helps the LLM understand what output to expect:
@tool
def get_weather(city: str) -> dict[str, int]:
    """Get weather data for a city.

    Args:
        city: City name to query.

    Returns:
        A dictionary with temp, humidity, and wind_speed.
    """
    return {"temp": 75, "humidity": 45, "wind_speed": 10}

# LLM sees this description:
# "Get weather data for a city. Returns: dict[str, int] - A dictionary with temp, humidity, and wind_speed."

# Access the return info directly:
print(get_weather.return_info)
# Output: "dict[str, int] - A dictionary with temp, humidity, and wind_speed."
What gets extracted:
| Source | Example | Result |
|---|---|---|
| Return type annotation | -> str | "str" |
| Generic types | -> dict[str, int] | "dict[str, int]" |
| Optional types | -> str \| None | "str \| None" |
| Docstring Returns section | Returns: The result. | "The result." |
| Both combined | Type + docstring | "dict[str, int] - A dictionary with..." |
[!TIP] Always include a Returns: section in your docstrings to give the LLM context about the output format.
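The combination of annotation and docstring text can be approximated with the standard library. The sketch below is an assumption about how such extraction might work, not Cogent's code; `type_name`, `returns_section`, and `return_info` are hypothetical helpers, and only Google-style docstrings are parsed.

```python
import inspect
from typing import get_args, get_type_hints


def type_name(tp) -> str:
    """Render an annotation roughly the way it was written."""
    if get_args(tp):          # parameterized generics such as dict[str, int]
        return str(tp)
    if isinstance(tp, type):  # plain classes such as str
        return tp.__name__
    return str(tp)


def returns_section(doc: str) -> str:
    """Collect the text under a Google-style 'Returns:' heading."""
    collected = []
    in_section = False
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped == "Returns:":
            in_section = True
        elif in_section and (not stripped or stripped.endswith(":")):
            break  # blank line or the next section heading ends the block
        elif in_section:
            collected.append(stripped)
    return " ".join(collected)


def return_info(fn) -> str:
    """Join the return annotation and the docstring description with ' - '."""
    parts = []
    annotation = get_type_hints(fn).get("return")
    if annotation is not None:
        parts.append(type_name(annotation))
    text = returns_section(inspect.getdoc(fn) or "")
    if text:
        parts.append(text)
    return " - ".join(parts)


def get_weather(city: str) -> dict[str, int]:
    """Get weather data for a city.

    Args:
        city: City name to query.

    Returns:
        A dictionary with temp, humidity, and wind_speed.
    """
    return {"temp": 75, "humidity": 45, "wind_speed": 10}


info = return_info(get_weather)
print(info)  # dict[str, int] - A dictionary with temp, humidity, and wind_speed.
```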
Decorator Options¶
@tool(
    name="web_search",              # Override function name
    description="Search the web",   # Override docstring
    return_direct=True,             # Return result directly to user
    cache=True,                     # Enable semantic caching (requires agent.cache)
)
def search(query: str, max_results: int = 10) -> str:
    """Search implementation."""
    return f"Found {max_results} results for: {query}"
| Option | Type | Default | Description |
|---|---|---|---|
| name | str | Function name | Override the tool name |
| description | str | Docstring | Override the tool description |
| return_direct | bool | False | Return result directly to user without LLM processing |
| cache | bool | False | Enable automatic semantic caching (see Semantic Caching) |
Async Tools¶
Most production tools are async (API calls, database queries, file I/O):
from cogent import tool
import httpx

@tool
async def fetch_url(url: str) -> str:
    """Fetch content from a URL.

    Args:
        url: The URL to fetch.

    Returns:
        The response text.
    """
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text
The executor handles both sync and async tools automatically.
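One common way to support both kinds of callables is to await coroutine functions directly and push blocking functions onto a worker thread. The sketch below illustrates that idea only; `call_tool` is a hypothetical helper, and whether Cogent's executor uses threads for sync tools is an assumption.

```python
import asyncio
import inspect


async def call_tool(fn, **kwargs):
    """Run a tool whether it is sync or async (illustrative dispatch only)."""
    if inspect.iscoroutinefunction(fn):
        return await fn(**kwargs)
    # Run blocking sync code in a worker thread so the event loop stays free.
    return await asyncio.to_thread(fn, **kwargs)


def add(a: int, b: int) -> int:
    return a + b


async def slow_echo(text: str) -> str:
    await asyncio.sleep(0.01)
    return text


async def main():
    return await asyncio.gather(
        call_tool(add, a=2, b=3),
        call_tool(slow_echo, text="hi"),
    )


results = asyncio.run(main())
print(results)  # [5, 'hi']
```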
Type Hints¶
Use type hints for automatic schema generation:
from cogent import tool
from typing import Literal

@tool
def search(
    query: str,
    engine: Literal["google", "bing", "duckduckgo"] = "duckduckgo",
    max_results: int = 10,
) -> list[dict[str, str]]:
    """Search the web.

    Args:
        query: Search query string.
        engine: Search engine to use.
        max_results: Maximum number of results.

    Returns:
        List of search results with title and URL.
    """
    # Implementation
    return [{"title": "Result 1", "url": "https://..."}]
Supported types:
- str, int, float, bool
- list[T], dict[K, V]
- Literal["option1", "option2"]
- Optional[T] or T | None
- Pydantic models
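As a rough illustration of how the non-Pydantic hints in this list could translate to JSON Schema, the sketch below maps each one recursively. `type_to_schema` is a hypothetical function written for this guide; the exact schema Cogent emits may differ, and Pydantic models are left out.

```python
import types
from typing import Any, Literal, Optional, Union, get_args, get_origin

# Hypothetical scalar mapping; bare list/dict carry no item information.
SCALARS = {
    str: "string", int: "integer", float: "number", bool: "boolean",
    list: "array", dict: "object", type(None): "null",
}

# `T | None` produces types.UnionType on Python 3.10+, Optional[T] produces Union.
UNION_ORIGINS = (Union,) + ((types.UnionType,) if hasattr(types, "UnionType") else ())


def type_to_schema(tp: Any) -> dict[str, Any]:
    """Translate a type hint into a JSON Schema fragment (illustrative only)."""
    origin, args = get_origin(tp), get_args(tp)
    if origin is Literal:
        return {"enum": list(args)}  # fixed set of allowed values
    if origin in UNION_ORIGINS:
        return {"anyOf": [type_to_schema(arg) for arg in args]}
    if origin is list:
        return {"type": "array", "items": type_to_schema(args[0])}
    if origin is dict:
        return {"type": "object", "additionalProperties": type_to_schema(args[1])}
    if tp in SCALARS:
        return {"type": SCALARS[tp]}
    return {}  # unknown types accept anything in this sketch


print(type_to_schema(Literal["google", "bing"]))
print(type_to_schema(Optional[int]))
print(type_to_schema(list[dict[str, str]]))
```

Narrow hints such as `Literal` give the LLM a closed set of choices, which tends to be more reliable than describing the allowed values in prose.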
Error Handling¶
Always handle errors gracefully:
from cogent import tool
import httpx

@tool
async def safe_fetch(url: str) -> str:
    """Safely fetch URL with error handling."""
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.get(url)
            response.raise_for_status()
            return response.text
    except httpx.TimeoutException:
        return f"Error: Request to {url} timed out after 10 seconds"
    except httpx.HTTPStatusError as e:
        return f"Error: HTTP {e.response.status_code} from {url}"
    except Exception as e:
        return f"Error fetching {url}: {str(e)}"
Best practices:
- Catch specific exceptions first
- Provide helpful error messages
- Return error as string (don't raise in tools)
- Log errors for debugging
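If many tools share the same fallback behavior, the last two practices can be factored into a small wrapper. This is a sketch under assumptions: `errors_as_text` is a hypothetical helper, and applying it underneath `@tool` is assumed to work like any other wrapped function.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("tools")


def errors_as_text(fn):
    """Wrap an async function so failures are logged and returned as strings."""
    @functools.wraps(fn)  # keep the name, docstring, and signature visible
    async def wrapper(*args, **kwargs):
        try:
            return await fn(*args, **kwargs)
        except Exception as exc:
            logger.exception("Tool %s failed", fn.__name__)  # full traceback for debugging
            return f"Error in {fn.__name__}: {type(exc).__name__}: {exc}"
    return wrapper


@errors_as_text
async def parse_port(value: str) -> str:
    """Parse a port number from text."""
    return str(int(value))


ok = asyncio.run(parse_port("8080"))
failed = asyncio.run(parse_port("http"))
print(ok)      # 8080
print(failed)  # Error in parse_port: ValueError: invalid literal for int() with base 10: 'http'
```

A blanket `except Exception` would also swallow control-flow exceptions such as Suspend if they derive from `Exception`, so tools that suspend should re-raise those before the generic handler.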
Tool Composition¶
Combine related tools into a toolkit class:
from cogent import tool

class Calculator:
    """Calculator with memory."""

    def __init__(self):
        self.memory: float = 0.0

    @tool
    def add(self, a: float, b: float) -> float:
        """Add two numbers."""
        result = a + b
        self.memory = result
        return result

    @tool
    def recall(self) -> float:
        """Recall last result from memory."""
        return self.memory
Pattern: Related tools in a class share state and configuration. This is exactly how toolkits work — see Toolkits.
Context Injection¶
Access agent context in tools:
import logging

from cogent import tool
from cogent.core.context import RunContext

logger = logging.getLogger(__name__)

@tool
async def get_user_data(
    user_id: int,
    ctx: RunContext,  # Auto-injected
) -> dict:
    """Get user data with context."""
    # Access agent context
    session_id = ctx.session_id
    user = ctx.user_id

    # Use context for logging, auth, etc.
    logger.info(f"User {user} requesting data for {user_id} in session {session_id}")
    return await fetch_user(user_id)  # fetch_user: your own data-access coroutine (not shown)
Context fields:
- session_id — Current session identifier
- user_id — User making the request
- metadata — Custom metadata dict
- agent — Reference to the agent instance
Registering Tools with Agents¶
from cogent import Agent
from cogent.toolkits import WebSearch, FileSystem

# Custom tools + toolkits together
agent = Agent(
    model="gpt-5.4-mini",
    tools=[get_weather, my_custom_tool, WebSearch(), FileSystem()],  # Custom @tool functions + toolkits
)
All tools (custom functions and toolkits) are available to the agent.
Testing Tools¶
Use pytest for tool testing:
import pytest
from cogent import tool

@tool
async def divide(a: float, b: float) -> float:
    """Divide two numbers."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

@pytest.mark.asyncio
async def test_divide_success():
    result = await divide(10, 2)
    assert result == 5.0

@pytest.mark.asyncio
async def test_divide_by_zero():
    with pytest.raises(ValueError, match="Cannot divide by zero"):
        await divide(10, 0)
Tool Patterns¶
Pattern 1: Retry with Exponential Backoff¶
import asyncio
from cogent import tool

@tool
async def resilient_api_call(url: str, max_retries: int = 3) -> str:
    """API call with retries."""
    for attempt in range(max_retries):
        try:
            return await fetch(url)  # fetch: your own async HTTP helper (not shown)
        except Exception as e:
            if attempt == max_retries - 1:
                return f"Failed after {max_retries} attempts: {e}"
            await asyncio.sleep(2 ** attempt)  # back off 1s, 2s, 4s, ...
    return "Failed: max_retries must be at least 1"
Pattern 2: Rate Limiting¶
import asyncio
from cogent import tool

class RateLimitedAPI:
    def __init__(self, calls_per_second: int = 10):
        self.delay = 1.0 / calls_per_second
        self.last_call = 0.0

    @tool
    async def call_api(self, endpoint: str) -> str:
        """Call API with rate limiting."""
        loop = asyncio.get_running_loop()
        now = loop.time()
        wait_time = max(0, self.delay - (now - self.last_call))
        if wait_time > 0:
            await asyncio.sleep(wait_time)
        self.last_call = loop.time()
        return await self._make_request(endpoint)  # _make_request: your own request method (not shown)
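Because tools can run in parallel, several calls may read the same `last_call` value before any of them updates it, and then fire together. One possible variation, sketched below with a hypothetical `RateLimiter` class, reserves the next time slot before sleeping so concurrent callers are spaced out.

```python
import asyncio
import time


class RateLimiter:
    """Space calls at least 1 / calls_per_second seconds apart."""

    def __init__(self, calls_per_second: float = 10.0):
        self.delay = 1.0 / calls_per_second
        self._next_slot = 0.0

    async def wait(self) -> None:
        now = time.monotonic()
        slot = max(now, self._next_slot)
        # Reserve the slot before yielding control, so the next caller sees it.
        self._next_slot = slot + self.delay
        await asyncio.sleep(slot - now)


async def main() -> float:
    limiter = RateLimiter(calls_per_second=20)
    start = time.monotonic()
    # Five concurrent callers get slots at roughly 0, 50, 100, 150, and 200 ms.
    await asyncio.gather(*(limiter.wait() for _ in range(5)))
    return time.monotonic() - start


elapsed = asyncio.run(main())
print(f"5 calls released over {elapsed:.2f}s")
```

No lock is needed here because the bookkeeping in `wait()` runs without an `await` in between, so no other coroutine can interleave with it.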
Pattern 3: Caching¶
Semantic Caching (Recommended)¶
Use @tool(cache=True) for automatic semantic caching. Similar queries return cached results:
from cogent import Agent, tool

@tool(cache=True)
async def search_products(query: str) -> str:
    """Search products in the catalog.

    Args:
        query: Search query for products.

    Returns:
        Product search results.
    """
    # Expensive API call - cached semantically
    return await product_api.search(query)

# Agent must have cache enabled
agent = Agent(
    model="gpt-5.4-mini",
    tools=[search_products],
    cache=True,  # Required for @tool(cache=True)
)

# First call executes the tool
await agent.run("Find running shoes")

# Similar query hits cache (semantic match)
await agent.run("Show me running sneakers")  # Cache hit!
How it works:
- Tool input is embedded using the agent's embedding model
- Cache checks for semantically similar previous calls
- If similarity exceeds threshold, cached result is returned
- Otherwise, tool executes and result is stored
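The four steps above can be sketched with a toy embedding and cosine similarity. Everything here is hypothetical and for illustration only: `SemanticCache`, `toy_embed`, the synonym table, and the 0.9 threshold are assumptions, and a real setup would use the agent's embedding model.

```python
import math
from typing import Optional

# Toy stand-in for an embedding model: counts of a few normalized keywords.
SYNONYMS = {"running": "run", "shoes": "shoe", "sneakers": "shoe", "laptops": "laptop"}
VOCAB = ["run", "shoe", "laptop"]


def toy_embed(text: str) -> list[float]:
    words = [SYNONYMS.get(word, word) for word in text.lower().split()]
    return [float(words.count(term)) for term in VOCAB]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, embed, threshold: float = 0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries: list[tuple[list[float], str]] = []

    def lookup(self, query: str) -> Optional[str]:
        """Return the stored result of the most similar query, if similar enough."""
        vector = self.embed(query)
        best = max(self.entries, key=lambda entry: cosine(vector, entry[0]), default=None)
        if best is not None and cosine(vector, best[0]) >= self.threshold:
            return best[1]
        return None

    def store(self, query: str, result: str) -> None:
        self.entries.append((self.embed(query), result))


cache = SemanticCache(toy_embed)
cache.store("Find running shoes", "3 products")
hit = cache.lookup("Show me running sneakers")
miss = cache.lookup("Find laptops")
print(hit, miss)  # 3 products None
```

The threshold is the main tuning knob: set it too low and unrelated queries share results, too high and the cache rarely hits.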
Requirements:
- Agent must have cache=True enabled
- An embedding model must be configured (or uses default)
- Tool must have cache=True in decorator
Simple LRU Caching¶
For exact-match caching (same input = same output):
from cogent import tool
from functools import lru_cache

@tool
@lru_cache(maxsize=100)
def cached_computation(input: str) -> str:
    """Expensive computation with exact-match caching."""
    # Cached result for identical input only
    return expensive_operation(input)
[!TIP] Use @tool(cache=True) for semantic similarity matching, @lru_cache for exact string matching.
Pattern 4: Validation¶
from cogent import tool
from pydantic import BaseModel, Field

class SearchParams(BaseModel):
    query: str = Field(min_length=1, max_length=500)
    max_results: int = Field(ge=1, le=100)

@tool
async def validated_search(params: SearchParams) -> list:
    """Search with validated parameters."""
    # Pydantic ensures constraints
    return await search(params.query, params.max_results)
Tool Execution¶
When the LLM requests several tool calls in a single turn, NativeExecutor runs them in parallel by default:
from cogent import Agent

agent = Agent(
    model="gpt-5.4-mini",
    tools=[fetch_weather, fetch_news, fetch_stock],
)

# If LLM requests multiple tools in one turn, they run concurrently
result = await agent.run("Get weather, news, and stock data")
# 3 tools at ~0.5s each finish in ~0.5s total instead of ~1.5s (parallel via asyncio.gather)
Execution behavior:
- Parallel: When LLM requests multiple tools in one turn
- Sequential: When LLM naturally calls tools one at a time across turns
- LLM decides: Based on task requirements and your prompt
Configuration:
from cogent.executors import NativeExecutor

executor = NativeExecutor(
    agent,
    max_tool_calls_per_turn=50,  # Max tools per LLM response
    max_concurrent_tools=20,     # Tune for external API rate limits
    resilience=True,             # Auto-retry on LLM rate limits
)
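A concurrency cap like `max_concurrent_tools` is commonly built from `asyncio.gather` plus a semaphore. The sketch below shows that general technique with hypothetical names (`run_tools`, `fake_tool`); it is not a description of NativeExecutor internals.

```python
import asyncio
import time


async def run_tools(calls, max_concurrent: int = 20):
    """Run zero-argument coroutine factories concurrently, capped at max_concurrent."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(call):
        async with semaphore:  # wait here if too many tools are already running
            return await call()

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(call) for call in calls))


async def fake_tool(name: str) -> str:
    await asyncio.sleep(0.1)  # stands in for a network call
    return f"{name} done"


async def main():
    calls = [lambda n=n: fake_tool(n) for n in ("weather", "news", "stock")]
    start = time.monotonic()
    results = await run_tools(calls, max_concurrent=3)
    return results, time.monotonic() - start


results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

With a cap of 3 the three calls overlap and finish in roughly the time of one; lowering the cap to 1 would make them run back to back.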
Standalone Execution¶
For quick tasks without creating an Agent:
from cogent.executors import run

result = await run(
    "Search for Python tutorials",
    tools=[search],
    model="gpt-5.4-mini",
)
Suspend / Resume¶
When a tool triggers a process that takes minutes, hours, or days — human
approval, a batch pipeline, an external webhook — the tool can suspend
the agent run. The application later calls agent.resume() with the real
result.
If the delay is short enough to poll for, pass a check callback and a
timeout — the executor polls automatically, and the tool completes
without suspending at all.
Raising Suspend¶
from cogent import Suspend, tool

@tool
async def request_approval(document: str) -> str:
    """Submit a document for manager approval."""
    ticket_id = await ticketing.create(document)
    raise Suspend(
        callback_id=ticket_id,
        message=f"Approval request {ticket_id} sent to manager.",
    )
callback_id is an application-defined string (job ID, ticket number, etc.)
that lets you match the resume call to the original suspension.
message is the interim text returned to the LLM so it can acknowledge
the pause before the run ends.
Auto-Resolve with check¶
When the result might arrive within seconds, add a check callback:
@tool
async def generate_report(topic: str) -> str:
    """Generate a report (finishes within seconds)."""
    job_id = await reports.submit(topic)
    raise Suspend(
        callback_id=job_id,
        message=f"Report {job_id} generation started.",
        check=lambda: reports.get_result(job_id),  # () -> str | None
        timeout=30.0,   # poll for up to 30 seconds
        interval=2.0,   # check every 2 seconds
    )
The executor calls check() every interval seconds. If it returns a
non-None string before the deadline, the tool completes normally — no
suspension, no external resume needed. If the timeout expires first, the
run suspends as usual.
Pass timeout=None to poll indefinitely — the tool waits as long as it
takes for check() to return a result. Use this when the external
service always has a polling / status endpoint:
@tool
async def request_approval(document: str) -> str:
    """Submit for approval and poll until approved."""
    ticket_id = await approval_api.submit(document)
    raise Suspend(
        callback_id=ticket_id,
        message=f"Approval {ticket_id} submitted.",
        check=lambda: approval_api.get_status(ticket_id),
        timeout=None,    # wait indefinitely
        interval=60.0,   # check every minute
    )
check can be sync or async. Other parallel tools in the same batch
run concurrently during polling.
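The polling semantics described in this section (fixed interval, optional deadline, sync or async `check`) can be sketched as a small loop. `poll` is a hypothetical helper written for illustration; the executor's real loop may differ in details such as when the deadline is tested.

```python
import asyncio
import inspect
import time
from typing import Optional


async def poll(check, timeout: Optional[float], interval: float) -> Optional[str]:
    """Call check() every `interval` seconds until it yields a value or time runs out.

    Returns None on timeout, which is the point where a run would suspend.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        result = check()
        if inspect.isawaitable(result):  # support async check callbacks
            result = await result
        if result is not None:
            return result
        if deadline is not None and time.monotonic() >= deadline:
            return None
        await asyncio.sleep(interval)


attempts = {"count": 0}


def fake_check() -> Optional[str]:
    """Pretend the external job finishes on the third status check."""
    attempts["count"] += 1
    return "report ready" if attempts["count"] >= 3 else None


resolved = asyncio.run(poll(fake_check, timeout=1.0, interval=0.01))
timed_out = asyncio.run(poll(lambda: None, timeout=0.05, interval=0.01))
print(resolved, timed_out)  # report ready None
```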
Detecting Suspension¶
agent.run() returns a Response with suspended=True and the
callback_id:
resp = await agent.run("Submit the Q3 budget for approval", thread_id="t1")

if resp.suspended:
    # Save callback_id, register a webhook, start a timer, etc.
    save_pending(resp.callback_id, thread_id="t1")
Resuming¶
When the external process completes, feed the result back:
final = await agent.resume(
    thread_id="t1",
    callback_id=resp.callback_id,
    result="Approved with minor edits.",
)
print(final.content)  # Agent's follow-up using the real result
resume() requires conversation memory so the agent can recall the
original context. It accepts the same optional parameters as run()
(context, max_iterations, reasoning, returns).
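Between suspension and resumption the application has to remember which thread each `callback_id` belongs to. The sketch below shows one way to do that bookkeeping; `PendingStore`, `handle_completion`, and `fake_resume` are hypothetical, with `fake_resume` standing in for `agent.resume` so the example runs without an agent.

```python
import asyncio
from typing import Optional


class PendingStore:
    """In-memory map of callback_id -> thread_id (a database would be used in production)."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}

    def save(self, callback_id: str, thread_id: str) -> None:
        self._pending[callback_id] = thread_id

    def pop(self, callback_id: str) -> Optional[str]:
        return self._pending.pop(callback_id, None)


async def handle_completion(store: PendingStore, resume, callback_id: str, result: str):
    """Called by a webhook or timer when the external process finishes."""
    thread_id = store.pop(callback_id)
    if thread_id is None:
        return None  # unknown or already-resumed callback, so ignore duplicates
    return await resume(thread_id=thread_id, callback_id=callback_id, result=result)


async def fake_resume(*, thread_id: str, callback_id: str, result: str) -> str:
    # Stand-in that only echoes its arguments.
    return f"[{thread_id}] resumed {callback_id} with: {result}"


store = PendingStore()
store.save("ticket-42", "t1")
first = asyncio.run(handle_completion(store, fake_resume, "ticket-42", "Approved with minor edits."))
duplicate = asyncio.run(handle_completion(store, fake_resume, "ticket-42", "Approved with minor edits."))
print(first)
print(duplicate)  # None
```

Removing the entry before resuming makes duplicate webhook deliveries harmless, at the cost of losing the mapping if the resume call itself fails.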
Observability¶
Suspend emits a tool.suspended event with callback_id and message
fields. Auto-resolve polling emits a tool.waiting event before polling
starts. The console formatter renders both with status indicators.
When to Use Each Pattern¶
| Scenario | Approach |
|---|---|
| Finishes in seconds (in-process) | await inside the tool |
| Finishes in seconds (external job) | Suspend with check + timeout |
| Finishes in minutes (same run) | Submit + check two-tool pattern |
| Finishes in hours/days, or needs human action | Suspend / resume() |
Tool Artifacts¶
Tools sometimes produce rich data — images, DataFrames, raw bytes — that
the application needs but the LLM does not. Tool artifacts let a tool
return sideband data that is kept off the LLM context and collected on
response.artifacts.
Returning an Artifact¶
Return a ToolResult instead of a plain string:
from cogent import tool, ToolResult

@tool
def generate_chart(topic: str) -> ToolResult:
    """Render a PNG chart.

    Args:
        topic: Chart subject.
    """
    png_bytes = render(topic)  # your rendering logic
    return ToolResult(
        content=f"Chart for '{topic}' rendered (PNG, {len(png_bytes)} bytes).",
        artifact=png_bytes,
        name="chart",
        mime_type="image/png",
    )
| Field | Purpose |
|---|---|
| content | Text the LLM sees (keep it short and descriptive) |
| artifact | Any Python object (bytes, dict, list, dataclass, etc.). Never sent to the LLM |
| name | Label for the artifact (defaults to the tool name) |
| mime_type | Optional MIME type hint for downstream consumers |
Consuming Artifacts¶
After an agent run, artifacts are available on the response:
from pathlib import Path

result = await agent.run("Generate a sales chart.")

for art in result.artifacts:
    print(art.name, art.mime_type, type(art.data))
    if art.mime_type == "image/png":
        Path("chart.png").write_bytes(art.data)
Each Artifact on the response carries:
| Field | Description |
|---|---|
| data | The original object returned by the tool |
| name | Artifact name |
| mime_type | MIME type (if provided) |
| tool_name | Name of the tool that produced it |
| tool_call_id | The LLM tool-call ID that triggered the tool |
When to Use Artifacts¶
Use ToolResult when:
- The tool produces binary data (images, PDFs, audio).
- The output is large or structured and the LLM only needs a summary.
- Downstream code needs the raw object, not a string representation.
If the LLM needs the full output to reason over, return a plain string instead.
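For binary artifacts, downstream code often just needs to write each one to disk with a sensible extension. The sketch below uses a stand-in `Artifact` dataclass carrying the `data`, `name`, and `mime_type` fields described earlier; both the dataclass and `save_artifacts` are hypothetical and not part of Cogent.

```python
import mimetypes
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional


@dataclass
class Artifact:
    """Stand-in for a response artifact (hypothetical, for this example only)."""
    data: Any
    name: str
    mime_type: Optional[str] = None


def save_artifacts(artifacts, out_dir: Path) -> list[Path]:
    """Write byte-valued artifacts to out_dir, choosing the extension from the MIME type."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for art in artifacts:
        if not isinstance(art.data, (bytes, bytearray)):
            continue  # non-binary objects need their own handling
        ext = mimetypes.guess_extension(art.mime_type or "") or ".bin"
        path = out_dir / f"{art.name}{ext}"
        path.write_bytes(art.data)
        saved.append(path)
    return saved


with tempfile.TemporaryDirectory() as tmp:
    paths = save_artifacts(
        [Artifact(b"\x89PNG", "chart", "image/png"), Artifact({"rows": 3}, "table")],
        Path(tmp),
    )
    saved_names = [path.name for path in paths]

print(saved_names)  # ['chart.png']
```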
Best Practices¶
- Use type hints — Enable automatic schema generation
- Write clear docstrings — Becomes tool description for LLM
- Handle errors gracefully — Return error strings, don't raise
- Make tools atomic — One clear purpose per tool
- Use async for I/O — All network/DB/file operations
- Validate inputs — Pydantic models for complex inputs
- Add retries for resilience — External calls can fail
- Log for observability — Track tool usage and errors
- Test thoroughly — Unit tests for all tools
- Document return types — Help LLM use results correctly
Common Pitfalls¶
| Issue | Problem | Solution |
|---|---|---|
| Missing type hints | No schema generated | Add type hints to all params |
| Unclear docstring | LLM misuses tool | Write clear, specific descriptions |
| Raising exceptions | Agent execution halts | Return error strings instead |
| Blocking I/O | Poor performance | Use async for all I/O operations |
| No error handling | Crashes on failures | Wrap in try/except |
| Too complex | LLM struggles to use | Split into multiple simpler tools |
Complex Types¶
Pydantic Models¶
from cogent import tool
from pydantic import BaseModel

class EmailRequest(BaseModel):
    to: str
    subject: str
    body: str
    cc: list[str] = []

@tool
def send_email(request: EmailRequest) -> str:
    """Send an email."""
    return f"Sent to {request.to}"
Enum Parameters¶
from enum import Enum

from cogent import tool

class Priority(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@tool
def create_task(title: str, priority: Priority = Priority.MEDIUM) -> str:
    """Create a task with priority."""
    return f"Created: {title} ({priority.value})"
API Reference¶
Decorators¶
| Decorator | Description |
|---|---|
| @tool | Create a tool from a function |
Core Classes¶
| Class | Description |
|---|---|
| BaseTool | Base class for tools |
| ToolResult | Return content + sideband artifact from a tool |
| Artifact | Artifact entry on Response.artifacts |
Utility Functions¶
| Function | Description |
|---|---|
| create_tool_from_function(fn) | Create tool from function |
Exceptions¶
| Exception | Description |
|---|---|
| Suspend(callback_id, message) | Pause the agent run until agent.resume() is called |
Further Reading¶
- Toolkits — 12+ production-ready tool classes
- Agent Configuration — Using tools with agents
- Resilience — Error handling and retry policies