Building Custom Tools¶

Create production-ready tools for Cogent agents.

Overview¶

Tools are functions that agents can call to interact with the world — APIs, databases, files, web searches, and more. Cogent provides two ways to add tools:

@tool decorator — Create custom tools from any function (this guide)
Toolkits — Use pre-built tool classes like WebSearch, FileSystem, CodeSandbox

Toolkits are classes that provide tools. They handle setup, state management, and expose multiple related tools:

from cogent import Agent
from cogent.toolkits import WebSearch, FileSystem, CodeSandbox

agent = Agent(
    model="gpt-5.4-mini",
    tools=[
        WebSearch(),                          # Provides: search, fetch_url
        FileSystem(allowed_paths=["./data"]), # Provides: read, write, list
        CodeSandbox(),                        # Provides: execute_python
    ],
)

See Toolkits for 12+ production-ready toolkits — KnowledgeGraph, Browser, PDF, Shell, MCP, and more.

For custom logic, use the @tool decorator:

Quick Start¶

from cogent import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city.

    Args:
        city: The city name to get weather for.

    Returns:
        Weather information as a string.
    """
    # Your implementation
    return f"Weather in {city}: 72°F, sunny"

That's it! The @tool decorator automatically:

- Extracts function signature → JSON schema
- Parses docstring → tool description
- Handles sync/async execution
- Provides error handling

The @tool Decorator¶

from cogent import tool

@tool
def my_tool(param1: str, param2: int = 10) -> str:
    """Tool description goes here.

    Args:
        param1: Description of param1.
        param2: Description of param2 (optional).

    Returns:
        Description of return value.
    """
    return f"Result: {param1} x {param2}"

What Gets Generated¶

{
  "type": "function",
  "function": {
    "name": "my_tool",
    "description": "Tool description goes here.",
    "parameters": {
      "type": "object",
      "properties": {
        "param1": {
          "type": "string",
          "description": "Description of param1."
        },
        "param2": {
          "type": "integer",
          "description": "Description of param2 (optional).",
          "default": 10
        }
      },
      "required": ["param1"]
    }
  }
}
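The mapping from signature to schema can be illustrated with a few lines of standard-library Python. This is a minimal sketch, not Cogent's implementation: `sketch_schema` and the `_JSON_TYPES` table are hypothetical names, only the four primitive types are covered, and the per-parameter descriptions that `@tool` takes from the docstring are omitted.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Hypothetical lookup table: Python primitives to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(fn: Callable[..., Any]) -> dict:
    """Build a function-calling schema from a signature (illustration only)."""
    hints = get_type_hints(fn)
    properties: dict = {}
    required: list = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        prop: dict = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    doc = inspect.getdoc(fn) or ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.splitlines()[0] if doc else "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def my_tool(param1: str, param2: int = 10) -> str:
    """Tool description goes here."""
    return f"Result: {param1} x {param2}"


schema = sketch_schema(my_tool)
print(schema["function"]["parameters"]["required"])  # ['param1']
```

Parameters without a default end up in `required`, which is why only `param1` appears there.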

Return Type Information¶

The @tool decorator automatically extracts return type information and includes it in the tool description. This helps the LLM understand what output to expect:

@tool
def get_weather(city: str) -> dict[str, int]:
    """Get weather data for a city.

    Args:
        city: City name to query.

    Returns:
        A dictionary with temp, humidity, and wind_speed.
    """
    return {"temp": 75, "humidity": 45, "wind_speed": 10}

# LLM sees this description:
# "Get weather data for a city. Returns: dict[str, int] - A dictionary with temp, humidity, and wind_speed."

# Access the return info directly:
print(get_weather.return_info)
# Output: "dict[str, int] - A dictionary with temp, humidity, and wind_speed."

What gets extracted:

Source	Example	Result
Return type annotation	`-> str`	`"str"`
Generic types	`-> dict[str, int]`	`"dict[str, int]"`
Optional types	`-> str | None`	`"str | None"`
Docstring Returns section	`Returns: The result.`	`"The result."`
Both combined	Type + docstring	`"dict[str, int] - A dictionary with..."`

[!TIP] Always include a Returns: section in your docstrings to give the LLM context about the output format.
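To see how a string like the one in `return_info` could be assembled, here is a rough standard-library sketch. It is an assumption about the general approach, not the decorator's actual code: `sketch_return_info` is a hypothetical helper, and it only recognizes a Google-style `Returns:` section whose text starts on the following line.

```python
import inspect
import re
from typing import get_origin


def sketch_return_info(fn) -> str:
    """Combine the return annotation and the docstring Returns section."""
    annotation = inspect.signature(fn).return_annotation
    if annotation is inspect.Signature.empty:
        type_text = ""
    elif get_origin(annotation) is None and isinstance(annotation, type):
        type_text = annotation.__name__  # plain classes such as str or int
    else:
        type_text = str(annotation)  # generics such as dict[str, int]

    doc = inspect.getdoc(fn) or ""
    # Capture the indented text after "Returns:" up to a blank line or the end.
    match = re.search(r"^Returns:[ \t]*\n(.*?)(?:\n[ \t]*\n|\Z)", doc, flags=re.S | re.M)
    doc_text = " ".join(match.group(1).split()) if match else ""
    return " - ".join(part for part in (type_text, doc_text) if part)


def get_weather(city: str) -> dict[str, int]:
    """Get weather data for a city.

    Returns:
        A dictionary with temp, humidity, and wind_speed.
    """
    return {"temp": 75, "humidity": 45, "wind_speed": 10}


print(sketch_return_info(get_weather))
# dict[str, int] - A dictionary with temp, humidity, and wind_speed.
```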

Decorator Options¶

@tool(
    name="web_search",           # Override function name
    description="Search the web",  # Override docstring
    return_direct=True,          # Return result directly to user
    cache=True,                  # Enable semantic caching (requires agent.cache)
)
def search(query: str, max_results: int = 10) -> str:
    """Search implementation."""
    return f"Found {max_results} results for: {query}"

Option	Type	Default	Description
`name`	`str`	Function name	Override the tool name
`description`	`str`	Docstring	Override the tool description
`return_direct`	`bool`	`False`	Return result directly to user without LLM processing
`cache`	`bool`	`False`	Enable automatic semantic caching (see Semantic Caching)

Async Tools¶

Most production tools are async (API calls, database queries, file I/O):

from cogent import tool
import httpx

@tool
async def fetch_url(url: str) -> str:
    """Fetch content from a URL.

    Args:
        url: The URL to fetch.

    Returns:
        The response text.
    """
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text

The executor handles both sync and async tools automatically.
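One common way for an executor to support both kinds of function is to await coroutines directly and push blocking sync functions onto a worker thread. The sketch below shows that idea with the standard library only. `call_tool` is a hypothetical helper, and whether Cogent's executor uses threads for sync tools is an assumption.

```python
import asyncio
import inspect
from typing import Any, Callable


async def call_tool(fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Invoke a sync or async callable from async code."""
    if inspect.iscoroutinefunction(fn):
        return await fn(**kwargs)
    # Run blocking sync code in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(fn, **kwargs)
```

The practical consequence is the same either way: a sync tool that blocks for a long time is better rewritten as async when an async client library is available.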

Type Hints¶

Use type hints for automatic schema generation:

from cogent import tool
from typing import Literal

@tool
def search(
    query: str,
    engine: Literal["google", "bing", "duckduckgo"] = "duckduckgo",
    max_results: int = 10,
) -> list[dict[str, str]]:
    """Search the web.

    Args:
        query: Search query string.
        engine: Search engine to use.
        max_results: Maximum number of results.

    Returns:
        List of search results with title and URL.
    """
    # Implementation
    return [{"title": "Result 1", "url": "https://..."}]

Supported types:

- str, int, float, bool
- list[T], dict[K, V]
- Literal["option1", "option2"]
- Optional[T] or T | None
- Pydantic models

Error Handling¶

Always handle errors gracefully:

from cogent import tool
import httpx

@tool
async def safe_fetch(url: str) -> str:
    """Safely fetch URL with error handling."""
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.get(url)
            response.raise_for_status()
            return response.text
    except httpx.TimeoutException:
        return f"Error: Request to {url} timed out after 10 seconds"
    except httpx.HTTPStatusError as e:
        return f"Error: HTTP {e.response.status_code} from {url}"
    except Exception as e:
        return f"Error fetching {url}: {str(e)}"

Best practices:

- Catch specific exceptions first
- Provide helpful error messages
- Return errors as strings (don't raise in tools)
- Log errors for debugging
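The last two practices can be combined in a small wrapper that logs the traceback and hands a readable message back to the LLM. This is a sketch under assumptions: `errors_as_text` is a hypothetical helper for async functions only, and how it composes with `@tool` depends on how the decorator inspects the wrapped function, so check that before relying on it.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("tools")


def errors_as_text(fn):
    """Wrap an async function so failures are logged and returned as text."""

    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        try:
            return await fn(*args, **kwargs)
        except Exception as exc:
            # logger.exception records the full traceback for debugging.
            logger.exception("Tool %s failed", fn.__name__)
            return f"Error in {fn.__name__}: {exc}"

    return wrapper


@errors_as_text
async def flaky_lookup(key: str) -> str:
    raise ValueError(f"no record for {key}")


print(asyncio.run(flaky_lookup("missing")))
# Error in flaky_lookup: no record for missing
```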

Tool Composition¶

Combine related tools into a toolkit class:

from cogent import tool

class Calculator:
    """Calculator with memory."""

    def __init__(self):
        self.memory: float = 0.0

    @tool
    def add(self, a: float, b: float) -> float:
        """Add two numbers."""
        result = a + b
        self.memory = result
        return result

    @tool
    def recall(self) -> float:
        """Recall last result from memory."""
        return self.memory

Pattern: Related tools in a class share state and configuration. This is exactly how toolkits work — see Toolkits.

Context Injection¶

Access agent context in tools:

import logging

from cogent import tool
from cogent.core.context import RunContext

logger = logging.getLogger(__name__)

@tool
async def get_user_data(
    user_id: int,
    ctx: RunContext,  # Auto-injected
) -> dict:
    """Get user data with context."""
    # Access agent context
    session_id = ctx.session_id
    user = ctx.user_id

    # Use context for logging, auth, etc
    logger.info(f"User {user} requesting data for {user_id} in session {session_id}")

    return await fetch_user(user_id)  # fetch_user: your own data-access coroutine

Context fields:

- session_id: Current session identifier
- user_id: User making the request
- metadata: Custom metadata dict
- agent: Reference to the agent instance

Registering Tools with Agents¶

from cogent import Agent
from cogent.toolkits import WebSearch, FileSystem

# Custom tools + toolkits together
agent = Agent(
    model="gpt-5.4-mini",
    tools=[get_weather, my_custom_tool, WebSearch(), FileSystem()],  # Custom @tool functions + toolkits
)

All tools (custom functions and toolkits) are available to the agent.

Testing Tools¶

Use pytest for tool testing:

import pytest
from cogent import tool

@tool
async def divide(a: float, b: float) -> float:
    """Divide two numbers."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

@pytest.mark.asyncio
async def test_divide_success():
    result = await divide(10, 2)
    assert result == 5.0

@pytest.mark.asyncio
async def test_divide_by_zero():
    with pytest.raises(ValueError, match="Cannot divide by zero"):
        await divide(10, 0)

Tool Patterns¶

Pattern 1: Retry with Exponential Backoff¶

import asyncio
from cogent import tool

@tool
async def resilient_api_call(url: str, max_retries: int = 3) -> str:
    """API call with retries."""
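    for attempt in range(max_retries):
        try:
            return await fetch(url)  # fetch: your own async HTTP helper
        except Exception as e:
            if attempt == max_retries - 1:
                return f"Failed after {max_retries} attempts: {e}"
            await asyncio.sleep(2 ** attempt)  # back off 1s, 2s, 4s, ...
    # Only reached when max_retries < 1, so the loop body never ran.
    return "Failed: max_retries must be at least 1"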
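A common refinement of this loop is to cap the delay and add random jitter, so that many clients retrying at once do not hit the service in lockstep. The following is a standalone sketch of that variant. `retry_with_backoff` and its parameters are hypothetical names, and the `sleep` argument exists only so the delays can be observed in tests.

```python
import asyncio
import random
from typing import Any, Awaitable, Callable


async def retry_with_backoff(
    op: Callable[[], Awaitable[Any]],
    max_retries: int = 3,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    """Retry op() with capped exponential backoff and full jitter."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return await op()
        except Exception as exc:
            if attempt == max_retries - 1:
                return f"Failed after {max_retries} attempts: {exc}"
            # Upper bound doubles each attempt (base, 2*base, 4*base, ...) up to cap.
            upper = min(cap, base * 2 ** attempt)
            await sleep(random.uniform(0, upper))
```

Returning the failure as a string keeps the behavior consistent with the error-handling guidance earlier in this guide.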

Pattern 2: Rate Limiting¶

import asyncio
from cogent import tool

class RateLimitedAPI:
    def __init__(self, calls_per_second: int = 10):
        self.delay = 1.0 / calls_per_second
        self.last_call = 0.0

    @tool
    async def call_api(self, endpoint: str) -> str:
        """Call API with rate limiting."""
        # get_running_loop() is the supported way to reach the loop from a coroutine
        loop = asyncio.get_running_loop()
        now = loop.time()
        wait_time = max(0, self.delay - (now - self.last_call))

        if wait_time > 0:
            await asyncio.sleep(wait_time)

        self.last_call = loop.time()
        return await self._make_request(endpoint)
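Because tools requested in the same turn run concurrently (see Tool Execution), two calls can read `last_call` at the same moment and both skip the wait. One way to close that gap is to reserve time slots under a lock. The sketch below is a standalone illustration with a hypothetical `MinIntervalLimiter` class and is not part of Cogent.

```python
import asyncio


class MinIntervalLimiter:
    """Space out calls by a minimum interval, safely under concurrency."""

    def __init__(self, calls_per_second: float = 10.0) -> None:
        self._interval = 1.0 / calls_per_second
        self._lock = asyncio.Lock()
        self._next_slot = 0.0  # loop time at which the next call may start

    async def wait(self) -> None:
        async with self._lock:
            now = asyncio.get_running_loop().time()
            delay = max(0.0, self._next_slot - now)
            # Reserve the slot while holding the lock so no two callers share it.
            self._next_slot = max(now, self._next_slot) + self._interval
        if delay > 0:
            await asyncio.sleep(delay)  # sleep outside the lock so others can queue
```

A tool would call `await limiter.wait()` before making its request. Create the limiter inside running async code (or on Python 3.10+) so the lock is not bound to a different event loop.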

Pattern 3: Caching¶

Semantic Caching (Recommended)¶

Use @tool(cache=True) for automatic semantic caching. Similar queries return cached results:

from cogent import Agent, tool

@tool(cache=True)
async def search_products(query: str) -> str:
    """Search products in the catalog.

    Args:
        query: Search query for products.

    Returns:
        Product search results.
    """
    # Expensive API call - cached semantically
    return await product_api.search(query)

# Agent must have cache enabled
agent = Agent(
    model="gpt-5.4-mini",
    tools=[search_products],
    cache=True,  # Required for @tool(cache=True)
)

# First call executes the tool
await agent.run("Find running shoes")

# Similar query hits cache (semantic match)
await agent.run("Show me running sneakers")  # Cache hit!

How it works:

Tool input is embedded using the agent's embedding model
Cache checks for semantically similar previous calls
If similarity exceeds threshold, cached result is returned
Otherwise, tool executes and result is stored
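The four steps above can be sketched in plain Python. This is a toy model, not Cogent's cache: `toy_embed` uses word counts as a stand-in for a real embedding model, and `SketchSemanticCache`, `cached_call`, and the 0.8 threshold are all hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Optional


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: word counts. A real setup would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SketchSemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self._entries = []  # list of (embedding, cached result) pairs

    def get(self, tool_input: str) -> Optional[str]:
        query = toy_embed(tool_input)  # step 1: embed the input
        best_score, best_result = 0.0, None
        for embedding, result in self._entries:  # step 2: compare with earlier calls
            score = cosine(query, embedding)
            if score > best_score:
                best_score, best_result = score, result
        # Step 3: only a sufficiently similar entry counts as a hit.
        return best_result if best_score >= self.threshold else None

    def put(self, tool_input: str, result: str) -> None:
        self._entries.append((toy_embed(tool_input), result))


def cached_call(cache: SketchSemanticCache, tool_fn: Callable[[str], str], tool_input: str) -> str:
    hit = cache.get(tool_input)
    if hit is not None:
        return hit
    result = tool_fn(tool_input)  # step 4: execute and store on a miss
    cache.put(tool_input, result)
    return result
```

With word counts, "find running shoes" and "find running shoes please" match, but "shoes" and "sneakers" do not. Matching synonyms is what a learned embedding model adds.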

Requirements:

Agent must have cache=True enabled
An embedding model must be available (configured explicitly, or the default)
Tool must have cache=True in decorator

Simple LRU Caching¶

For exact-match caching (same input = same output):

from cogent import tool
from functools import lru_cache

@tool
@lru_cache(maxsize=100)
def cached_computation(text: str) -> str:
    """Expensive computation with exact-match caching."""
    # Cached result for identical input only; `text` avoids shadowing the built-in input()
    return expensive_operation(text)

[!TIP] Use @tool(cache=True) for semantic similarity matching, @lru_cache for exact string matching.

Pattern 4: Validation¶

from cogent import tool
from pydantic import BaseModel, Field

class SearchParams(BaseModel):
    query: str = Field(min_length=1, max_length=500)
    max_results: int = Field(ge=1, le=100)

@tool
async def validated_search(params: SearchParams) -> list:
    """Search with validated parameters."""
    # Pydantic ensures constraints
    return await search(params.query, params.max_results)

Tool Execution¶

When an agent has multiple tools, they execute in parallel by default via NativeExecutor:

from cogent import Agent

agent = Agent(
    model="gpt-5.4-mini",
    tools=[fetch_weather, fetch_news, fetch_stock]
)

# If LLM requests multiple tools in one turn, they run concurrently
result = await agent.run("Get weather, news, and stock data")
# 3 tools × 0.5s each = ~0.5s total (parallel via asyncio.gather)

Execution behavior:

- Parallel: When LLM requests multiple tools in one turn
- Sequential: When LLM naturally calls tools one at a time across turns
- LLM decides: Based on task requirements and your prompt

Configuration:

from cogent.executors import NativeExecutor

executor = NativeExecutor(
    agent,
    max_tool_calls_per_turn=50,  # Max tools per LLM response
    max_concurrent_tools=20,      # Tune for external API rate limits
    resilience=True               # Auto-retry on LLM rate limits
)
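A limit like `max_concurrent_tools` is typically enforced with a semaphore around each call. The sketch below shows the general technique with `asyncio.gather`. `run_bounded` is a hypothetical helper, and whether NativeExecutor is implemented this way is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def run_bounded(
    calls: Sequence[Callable[[], Awaitable[Any]]],
    max_concurrent: int = 20,
) -> List[Any]:
    """Run all calls concurrently, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(call: Callable[[], Awaitable[Any]]) -> Any:
        async with semaphore:  # waits here while max_concurrent calls are running
            return await call()

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(call) for call in calls))
```

Lowering the limit trades latency for gentler load on rate-limited external APIs.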

Standalone Execution¶

For quick tasks without creating an Agent:

from cogent.executors import run

result = await run(
    "Search for Python tutorials",
    tools=[search],
    model="gpt-5.4-mini",
)

Suspend / Resume¶

When a tool triggers a process that takes minutes, hours, or days — human approval, a batch pipeline, an external webhook — the tool can suspend the agent run. The application later calls agent.resume() with the real result.

If the delay is short enough to poll for, pass a check callback and a timeout — the executor polls automatically, and the tool completes without suspending at all.

Raising Suspend¶

from cogent import Suspend, tool

@tool
async def request_approval(document: str) -> str:
    """Submit a document for manager approval."""
    ticket_id = await ticketing.create(document)
    raise Suspend(
        callback_id=ticket_id,
        message=f"Approval request {ticket_id} sent to manager.",
    )

callback_id is an application-defined string (job ID, ticket number, etc.) that lets you match the resume call to the original suspension.

message is the interim text returned to the LLM so it can acknowledge the pause before the run ends.

Auto-Resolve with `check`¶

When the result might arrive within seconds, add a check callback:

@tool
async def generate_report(topic: str) -> str:
    """Generate a report — finishes within seconds."""
    job_id = await reports.submit(topic)

    raise Suspend(
        callback_id=job_id,
        message=f"Report {job_id} generation started.",
        check=lambda: reports.get_result(job_id),  # () -> str | None
        timeout=30.0,   # poll for up to 30 seconds
        interval=2.0,   # check every 2 seconds
    )

The executor calls check() every interval seconds. If it returns a non-None string before the deadline, the tool completes normally — no suspension, no external resume needed. If the timeout expires first, the run suspends as usual.
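The polling behavior described here can be modeled as a small loop. This is a sketch of the semantics only: `poll_until` is a hypothetical function, and details such as whether the first check happens immediately are assumptions rather than documented executor behavior.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional


async def poll_until(
    check: Callable[[], Any],
    timeout: Optional[float],
    interval: float,
) -> Optional[str]:
    """Return the first non-None result of check(), or None once the timeout expires."""
    loop = asyncio.get_running_loop()
    deadline = None if timeout is None else loop.time() + timeout
    while True:
        result = check()
        if inspect.isawaitable(result):  # async check callbacks return an awaitable
            result = await result
        if result is not None:
            return result  # the tool would complete normally with this value
        if deadline is not None and loop.time() + interval > deadline:
            return None  # the next poll would land past the deadline: suspend
        await asyncio.sleep(interval)
```

A `None` return corresponds to the case where the run suspends and waits for an external resume.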

Pass timeout=None to poll indefinitely — the tool waits as long as it takes for check() to return a result. Use this when the external service always has a polling / status endpoint:

@tool
async def request_approval(document: str) -> str:
    """Submit for approval — polls until approved."""
    ticket_id = await approval_api.submit(document)

    raise Suspend(
        callback_id=ticket_id,
        message=f"Approval {ticket_id} submitted.",
        check=lambda: approval_api.get_status(ticket_id),
        timeout=None,    # wait indefinitely
        interval=60.0,   # check every minute
    )

check can be sync or async. Other parallel tools in the same batch run concurrently during polling.

Detecting Suspension¶

agent.run() returns a Response with suspended=True and the callback_id:

resp = await agent.run("Submit the Q3 budget for approval", thread_id="t1")

if resp.suspended:
    # Save callback_id, register a webhook, start a timer, etc.
    save_pending(resp.callback_id, thread_id="t1")
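`save_pending` in the snippet above is application code, not a Cogent function. A minimal in-memory version might look like the sketch below. `pop_pending` is a hypothetical counterpart for the webhook side, and a real deployment would use a durable store (database, Redis) so pending runs survive a restart.

```python
from typing import Dict, Optional

# Maps callback_id -> thread_id for runs that are waiting on an external result.
_pending: Dict[str, str] = {}


def save_pending(callback_id: str, thread_id: str) -> None:
    """Remember which conversation thread a suspended run belongs to."""
    _pending[callback_id] = thread_id


def pop_pending(callback_id: str) -> Optional[str]:
    """Look up and remove the thread for a callback; None if unknown or already resumed."""
    return _pending.pop(callback_id, None)
```

Removing the entry on lookup makes a duplicate webhook delivery easy to detect: the second lookup returns `None`.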

Resuming¶

When the external process completes, feed the result back:

final = await agent.resume(
    thread_id="t1",
    callback_id=resp.callback_id,
    result="Approved with minor edits.",
)
print(final.content)  # Agent's follow-up using the real result

resume() requires conversation memory so the agent can recall the original context. It accepts the same optional parameters as run() (context, max_iterations, reasoning, returns).

Observability¶

Suspend emits a tool.suspended event with callback_id and message fields. Auto-resolve polling emits a tool.waiting event before polling starts. The console formatter renders both with status indicators.

When to Use Each Pattern¶

Scenario	Approach
Finishes in seconds (in-process)	`await` inside the tool
Finishes in seconds (external job)	`Suspend` with `check` + `timeout`
Finishes in minutes (same run)	Submit + check two-tool pattern
Finishes in hours/days, or needs human action	`Suspend` / `resume()`

Tool Artifacts¶

Tools sometimes produce rich data — images, DataFrames, raw bytes — that the application needs but the LLM does not. Tool artifacts let a tool return sideband data that is kept off the LLM context and collected on response.artifacts.

Returning an Artifact¶

Return a ToolResult instead of a plain string:

from cogent import tool, ToolResult

@tool
def generate_chart(topic: str) -> ToolResult:
    """Render a PNG chart.

    Args:
        topic: Chart subject.
    """
    png_bytes = render(topic)  # your rendering logic
    return ToolResult(
        content=f"Chart for '{topic}' rendered (PNG, {len(png_bytes)} bytes).",
        artifact=png_bytes,
        name="chart",
        mime_type="image/png",
    )

Field	Purpose
`content`	Text the LLM sees (keep it short and descriptive)
`artifact`	Any Python object — bytes, dict, list, dataclass, etc. Never sent to the LLM
`name`	Label for the artifact (defaults to the tool name)
`mime_type`	Optional MIME type hint for downstream consumers

Consuming Artifacts¶

After an agent run, artifacts are available on the response:

from pathlib import Path

result = await agent.run("Generate a sales chart.")

for art in result.artifacts:
    print(art.name, art.mime_type, type(art.data))
    if art.mime_type == "image/png":
        Path("chart.png").write_bytes(art.data)

Each Artifact on the response carries:

Field	Description
`data`	The original object returned by the tool
`name`	Artifact name
`mime_type`	MIME type (if provided)
`tool_name`	Name of the tool that produced it
`tool_call_id`	The LLM tool-call ID that triggered the tool

When to Use Artifacts¶

Use ToolResult when:

The tool produces binary data (images, PDFs, audio).
The output is large or structured and the LLM only needs a summary.
Downstream code needs the raw object, not a string representation.

If the LLM needs the full output to reason over, return a plain string instead.

Best Practices¶

Use type hints — Enable automatic schema generation
Write clear docstrings — Becomes tool description for LLM
Handle errors gracefully — Return error strings, don't raise
Make tools atomic — One clear purpose per tool
Use async for I/O — All network/DB/file operations
Validate inputs — Pydantic models for complex inputs
Add retries for resilience — External calls can fail
Log for observability — Track tool usage and errors
Test thoroughly — Unit tests for all tools
Document return types — Help LLM use results correctly

Common Pitfalls¶

Issue	Problem	Solution
Missing type hints	No schema generated	Add type hints to all params
Unclear docstring	LLM misuses tool	Write clear, specific descriptions
Raising exceptions	Agent execution halts	Return error strings instead
Blocking I/O	Poor performance	Use async for all I/O operations
No error handling	Crashes on failures	Wrap in try/except
Too complex	LLM struggles to use	Split into multiple simpler tools

Complex Types¶

Pydantic Models¶

from cogent import tool
from pydantic import BaseModel

class EmailRequest(BaseModel):
    to: str
    subject: str
    body: str
    cc: list[str] = []

@tool
def send_email(request: EmailRequest) -> str:
    """Send an email."""
    return f"Sent to {request.to}"

Enum Parameters¶

from enum import Enum

from cogent import tool

class Priority(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@tool
def create_task(title: str, priority: Priority = Priority.MEDIUM) -> str:
    """Create a task with priority."""
    return f"Created: {title} ({priority.value})"

API Reference¶

Decorators¶

Decorator	Description
`@tool`	Create a tool from a function

Core Classes¶

Class	Description
`BaseTool`	Base class for tools
`ToolResult`	Return content + sideband artifact from a tool
`Artifact`	Artifact entry on `Response.artifacts`

Utility Functions¶

Function	Description
`create_tool_from_function(fn)`	Create tool from function

Exceptions¶

Exception	Description
`Suspend(callback_id, message)`	Pause the agent run until `agent.resume()` is called