Skip to content

Building Custom Tools

Create production-ready tools for cogent agents.

Overview

Tools are functions that agents can call to interact with the world — APIs, databases, files, web searches, and more. Cogent provides two ways to add tools:

  1. @tool decorator — Create custom tools from any function (this guide)
  2. Toolkits — Use pre-built tool classes like WebSearch, FileSystem, CodeSandbox

Toolkits are classes that provide tools. They handle setup, state management, and expose multiple related tools:

from cogent import Agent
from cogent.toolkits import WebSearch, FileSystem, CodeSandbox

agent = Agent(
    model="gpt-5.4-mini",
    tools=[
        WebSearch(),                          # Provides: search, fetch_url
        FileSystem(allowed_paths=["./data"]), # Provides: read, write, list
        CodeSandbox(),                        # Provides: execute_python
    ],
)

See Toolkits for 12+ production-ready toolkits — KnowledgeGraph, Browser, PDF, Shell, MCP, and more.

For custom logic, use the @tool decorator:

Quick Start

from cogent import tool

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city.

    Args:
        city: The city name to get weather for.

    Returns:
        Weather information as a string.
    """
    # Your implementation
    return f"Weather in {city}: 72°F, sunny"

That's it! The @tool decorator automatically: - Extracts function signature → JSON schema - Parses docstring → tool description - Handles sync/async execution - Provides error handling

The @tool Decorator

from cogent import tool

@tool
def my_tool(param1: str, param2: int = 10) -> str:
    """Tool description goes here.

    Args:
        param1: Description of param1.
        param2: Description of param2 (optional).

    Returns:
        Description of return value.
    """
    return f"Result: {param1} x {param2}"

What Gets Generated

{
  "type": "function",
  "function": {
    "name": "my_tool",
    "description": "Tool description goes here.",
    "parameters": {
      "type": "object",
      "properties": {
        "param1": {
          "type": "string",
          "description": "Description of param1."
        },
        "param2": {
          "type": "integer",
          "description": "Description of param2 (optional).",
          "default": 10
        }
      },
      "required": ["param1"]
    }
  }
}

Return Type Information

The @tool decorator automatically extracts return type information and includes it in the tool description. This helps the LLM understand what output to expect:

@tool
def get_weather(city: str) -> dict[str, int]:
    """Get weather data for a city.

    Args:
        city: City name to query.

    Returns:
        A dictionary with temp, humidity, and wind_speed.
    """
    return {"temp": 75, "humidity": 45, "wind_speed": 10}

# LLM sees this description:
# "Get weather data for a city. Returns: dict[str, int] - A dictionary with temp, humidity, and wind_speed."

# Access the return info directly:
print(get_weather.return_info)
# Output: "dict[str, int] - A dictionary with temp, humidity, and wind_speed."

What gets extracted:

Source Example Result
Return type annotation -> str "str"
Generic types -> dict[str, int] "dict[str, int]"
Optional types -> str \| None "str \| None"
Docstring Returns section Returns: The result. "The result."
Both combined Type + docstring "dict[str, int] - A dictionary with..."

[!TIP] Always include a Returns: section in your docstrings to give the LLM context about the output format.

Decorator Options

@tool(
    name="web_search",           # Override function name
    description="Search the web",  # Override docstring
    return_direct=True,          # Return result directly to user
    cache=True,                  # Enable semantic caching (requires agent.cache)
)
def search(query: str, max_results: int = 10) -> str:
    """Search implementation."""
    return f"Found {max_results} results for: {query}"
Option Type Default Description
name str Function name Override the tool name
description str Docstring Override the tool description
return_direct bool False Return result directly to user without LLM processing
cache bool False Enable automatic semantic caching (see Semantic Caching)

Async Tools

Most production tools are async (API calls, database queries, file I/O):

from cogent import tool
import httpx

@tool
async def fetch_url(url: str) -> str:
    """Fetch content from a URL.

    Args:
        url: The URL to fetch.

    Returns:
        The response text.
    """
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text

The executor handles both sync and async tools automatically.

Type Hints

Use type hints for automatic schema generation:

from cogent import tool
from typing import Literal

@tool
def search(
    query: str,
    engine: Literal["google", "bing", "duckduckgo"] = "duckduckgo",
    max_results: int = 10,
) -> list[dict[str, str]]:
    """Search the web.

    Args:
        query: Search query string.
        engine: Search engine to use.
        max_results: Maximum number of results.

    Returns:
        List of search results with title and URL.
    """
    # Implementation
    return [{"title": "Result 1", "url": "https://..."}]

Supported types: - str, int, float, bool - list[T], dict[K, V] - Literal["option1", "option2"] - Optional[T] or T | None - Pydantic models

Error Handling

Always handle errors gracefully:

from cogent import tool
import httpx

@tool
async def safe_fetch(url: str) -> str:
    """Safely fetch URL with error handling."""
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.get(url)
            response.raise_for_status()
            return response.text
    except httpx.TimeoutException:
        return f"Error: Request to {url} timed out after 10 seconds"
    except httpx.HTTPStatusError as e:
        return f"Error: HTTP {e.response.status_code} from {url}"
    except Exception as e:
        return f"Error fetching {url}: {str(e)}"

Best practices: - Catch specific exceptions first - Provide helpful error messages - Return error as string (don't raise in tools) - Log errors for debugging

Tool Composition

Combine related tools into a toolkit class:

from cogent import tool

class Calculator:
    """Calculator with memory."""

    def __init__(self):
        self.memory: float = 0.0

    @tool
    def add(self, a: float, b: float) -> float:
        """Add two numbers."""
        result = a + b
        self.memory = result
        return result

    @tool
    def recall(self) -> float:
        """Recall last result from memory."""
        return self.memory

Pattern: Related tools in a class share state and configuration. This is exactly how toolkits work — see Toolkits.

Context Injection

Access agent context in tools:

from cogent import tool
from cogent.core.context import RunContext

@tool
async def get_user_data(
    user_id: int,
    ctx: RunContext,  # Auto-injected
) -> dict:
    """Get user data with context."""
    # Access agent context
    session_id = ctx.session_id
    user = ctx.user_id

    # Use context for logging, auth, etc
    logger.info(f"User {user} requesting data for {user_id} in session {session_id}")

    return await fetch_user(user_id)

Context fields: - session_id — Current session identifier - user_id — User making the request - metadata — Custom metadata dict - agent — Reference to the agent instance

Registering Tools with Agents

from cogent import Agent
from cogent.toolkits import WebSearch, FileSystem

# Custom tools + toolkits together
agent = Agent(
    model="gpt-5.4-mini",
    tools=[get_weather, my_custom_tool, WebSearch(), FileSystem()],  # Custom @tool functions + toolkits
)

All tools (custom functions and toolkits) are available to the agent.

Testing Tools

Use pytest for tool testing:

import pytest
from cogent import tool

@tool
async def divide(a: float, b: float) -> float:
    """Divide two numbers."""
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

@pytest.mark.asyncio
async def test_divide_success():
    result = await divide(10, 2)
    assert result == 5.0

@pytest.mark.asyncio
async def test_divide_by_zero():
    with pytest.raises(ValueError, match="Cannot divide by zero"):
        await divide(10, 0)

Tool Patterns

Pattern 1: Retry with Exponential Backoff

import asyncio
from cogent import tool

@tool
async def resilient_api_call(url: str, max_retries: int = 3) -> str:
    """API call with retries."""
    for attempt in range(max_retries):
        try:
            return await fetch(url)
        except Exception as e:
            if attempt == max_retries - 1:
                return f"Failed after {max_retries} attempts: {e}"
            await asyncio.sleep(2 ** attempt)

Pattern 2: Rate Limiting

import asyncio
from cogent import tool

class RateLimitedAPI:
    def __init__(self, calls_per_second: int = 10):
        self.delay = 1.0 / calls_per_second
        self.last_call = 0.0

    @tool
    async def call_api(self, endpoint: str) -> str:
        """Call API with rate limiting."""
        now = asyncio.get_event_loop().time()
        wait_time = max(0, self.delay - (now - self.last_call))

        if wait_time > 0:
            await asyncio.sleep(wait_time)

        self.last_call = asyncio.get_event_loop().time()
        return await self._make_request(endpoint)

Pattern 3: Caching

Use @tool(cache=True) for automatic semantic caching. Similar queries return cached results:

from cogent import Agent, tool

@tool(cache=True)
async def search_products(query: str) -> str:
    """Search products in the catalog.

    Args:
        query: Search query for products.

    Returns:
        Product search results.
    """
    # Expensive API call - cached semantically
    return await product_api.search(query)

# Agent must have cache enabled
agent = Agent(
    model="gpt-5.4-mini",
    tools=[search_products],
    cache=True,  # Required for @tool(cache=True)
)

# First call executes the tool
await agent.run("Find running shoes")

# Similar query hits cache (semantic match)
await agent.run("Show me running sneakers")  # Cache hit!

How it works:

  1. Tool input is embedded using the agent's embedding model
  2. Cache checks for semantically similar previous calls
  3. If similarity exceeds threshold, cached result is returned
  4. Otherwise, tool executes and result is stored

Requirements:

  • Agent must have cache=True enabled
  • An embedding model must be configured (or uses default)
  • Tool must have cache=True in decorator

Simple LRU Caching

For exact-match caching (same input = same output):

from cogent import tool
from functools import lru_cache

@tool
@lru_cache(maxsize=100)
def cached_computation(input: str) -> str:
    """Expensive computation with exact-match caching."""
    # Cached result for identical input only
    return expensive_operation(input)

[!TIP] Use @tool(cache=True) for semantic similarity matching, @lru_cache for exact string matching.

Pattern 4: Validation

from cogent import tool
from pydantic import BaseModel, Field

class SearchParams(BaseModel):
    query: str = Field(min_length=1, max_length=500)
    max_results: int = Field(ge=1, le=100)

@tool
async def validated_search(params: SearchParams) -> list:
    """Search with validated parameters."""
    # Pydantic ensures constraints
    return await search(params.query, params.max_results)

Tool Execution

When an agent has multiple tools, they execute in parallel by default via NativeExecutor:

from cogent import Agent

agent = Agent(
    model="gpt-5.4-mini",
    tools=[fetch_weather, fetch_news, fetch_stock]
)

# If LLM requests multiple tools in one turn, they run concurrently
result = await agent.run("Get weather, news, and stock data")
# 3 tools × 0.5s each = ~0.5s total (parallel via asyncio.gather)

Execution behavior: - Parallel: When LLM requests multiple tools in one turn - Sequential: When LLM naturally calls tools one at a time across turns - LLM decides: Based on task requirements and your prompt

Configuration:

from cogent.executors import NativeExecutor

executor = NativeExecutor(
    agent,
    max_tool_calls_per_turn=50,  # Max tools per LLM response
    max_concurrent_tools=20,      # Tune for external API rate limits
    resilience=True               # Auto-retry on LLM rate limits
)

Standalone Execution

For quick tasks without creating an Agent:

from cogent.executors import run

result = await run(
    "Search for Python tutorials",
    tools=[search],
    model="gpt-5.4-mini",
)

Suspend / Resume

When a tool triggers a process that takes minutes, hours, or days — human approval, a batch pipeline, an external webhook — the tool can suspend the agent run. The application later calls agent.resume() with the real result.

If the delay is short enough to poll for, pass a check callback and a timeout — the executor polls automatically, and the tool completes without suspending at all.

Raising Suspend

from cogent import Suspend, tool

@tool
async def request_approval(document: str) -> str:
    """Submit a document for manager approval."""
    ticket_id = await ticketing.create(document)
    raise Suspend(
        callback_id=ticket_id,
        message=f"Approval request {ticket_id} sent to manager.",
    )

callback_id is an application-defined string (job ID, ticket number, etc.) that lets you match the resume call to the original suspension.

message is the interim text returned to the LLM so it can acknowledge the pause before the run ends.

Auto-Resolve with check

When the result might arrive within seconds, add a check callback:

@tool
async def generate_report(topic: str) -> str:
    """Generate a report — finishes within seconds."""
    job_id = await reports.submit(topic)

    raise Suspend(
        callback_id=job_id,
        message=f"Report {job_id} generation started.",
        check=lambda: reports.get_result(job_id),  # () -> str | None
        timeout=30.0,   # poll for up to 30 seconds
        interval=2.0,   # check every 2 seconds
    )

The executor calls check() every interval seconds. If it returns a non-None string before the deadline, the tool completes normally — no suspension, no external resume needed. If the timeout expires first, the run suspends as usual.

Pass timeout=None to poll indefinitely — the tool waits as long as it takes for check() to return a result. Use this when the external service always has a polling / status endpoint:

@tool
async def request_approval(document: str) -> str:
    """Submit for approval — polls until approved."""
    ticket_id = await approval_api.submit(document)

    raise Suspend(
        callback_id=ticket_id,
        message=f"Approval {ticket_id} submitted.",
        check=lambda: approval_api.get_status(ticket_id),
        timeout=None,    # wait indefinitely
        interval=60.0,   # check every minute
    )

check can be sync or async. Other parallel tools in the same batch run concurrently during polling.

Detecting Suspension

agent.run() returns a Response with suspended=True and the callback_id:

resp = await agent.run("Submit the Q3 budget for approval", thread_id="t1")

if resp.suspended:
    # Save callback_id, register a webhook, start a timer, etc.
    save_pending(resp.callback_id, thread_id="t1")

Resuming

When the external process completes, feed the result back:

final = await agent.resume(
    thread_id="t1",
    callback_id=resp.callback_id,
    result="Approved with minor edits.",
)
print(final.content)  # Agent's follow-up using the real result

resume() requires conversation memory so the agent can recall the original context. It accepts the same optional parameters as run() (context, max_iterations, reasoning, returns).

Observability

Suspend emits a tool.suspended event with callback_id and message fields. Auto-resolve polling emits a tool.waiting event before polling starts. The console formatter renders both with status indicators.

When to Use Each Pattern

Scenario Approach
Finishes in seconds (in-process) await inside the tool
Finishes in seconds (external job) Suspend with check + timeout
Finishes in minutes (same run) Submit + check two-tool pattern
Finishes in hours/days, or needs human action Suspend / resume()

Tool Artifacts

Tools sometimes produce rich data — images, DataFrames, raw bytes — that the application needs but the LLM does not. Tool artifacts let a tool return sideband data that is kept off the LLM context and collected on response.artifacts.

Returning an Artifact

Return a ToolResult instead of a plain string:

from cogent import tool, ToolResult

@tool
def generate_chart(topic: str) -> ToolResult:
    """Render a PNG chart.

    Args:
        topic: Chart subject.
    """
    png_bytes = render(topic)  # your rendering logic
    return ToolResult(
        content=f"Chart for '{topic}' rendered (PNG, {len(png_bytes)} bytes).",
        artifact=png_bytes,
        name="chart",
        mime_type="image/png",
    )
Field Purpose
content Text the LLM sees (keep it short and descriptive)
artifact Any Python object — bytes, dict, list, dataclass, etc. Never sent to the LLM
name Label for the artifact (defaults to the tool name)
mime_type Optional MIME type hint for downstream consumers

Consuming Artifacts

After an agent run, artifacts are available on the response:

result = await agent.run("Generate a sales chart.")

for art in result.artifacts:
    print(art.name, art.mime_type, type(art.data))
    if art.mime_type == "image/png":
        Path("chart.png").write_bytes(art.data)

Each Artifact on the response carries:

Field Description
data The original object returned by the tool
name Artifact name
mime_type MIME type (if provided)
tool_name Name of the tool that produced it
tool_call_id The LLM tool-call ID that triggered the tool

When to Use Artifacts

Use ToolResult when:

  • The tool produces binary data (images, PDFs, audio).
  • The output is large or structured and the LLM only needs a summary.
  • Downstream code needs the raw object, not a string representation.

If the LLM needs the full output to reason over, return a plain string instead.

Best Practices

  1. Use type hints — Enable automatic schema generation
  2. Write clear docstrings — Becomes tool description for LLM
  3. Handle errors gracefully — Return error strings, don't raise
  4. Make tools atomic — One clear purpose per tool
  5. Use async for I/O — All network/DB/file operations
  6. Validate inputs — Pydantic models for complex inputs
  7. Add retries for resilience — External calls can fail
  8. Log for observability — Track tool usage and errors
  9. Test thoroughly — Unit tests for all tools
  10. Document return types — Help LLM use results correctly

Common Pitfalls

Issue Problem Solution
Missing type hints No schema generated Add type hints to all params
Unclear docstring LLM misuses tool Write clear, specific descriptions
Raising exceptions Agent execution halts Return error strings instead
Blocking I/O Poor performance Use async for all I/O operations
No error handling Crashes on failures Wrap in try/except
Too complex LLM struggles to use Split into multiple simpler tools

Complex Types

Pydantic Models

from pydantic import BaseModel

class EmailRequest(BaseModel):
    to: str
    subject: str
    body: str
    cc: list[str] = []

@tool
def send_email(request: EmailRequest) -> str:
    """Send an email."""
    return f"Sent to {request.to}"

Enum Parameters

from enum import Enum

class Priority(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@tool
def create_task(title: str, priority: Priority = Priority.MEDIUM) -> str:
    """Create a task with priority."""
    return f"Created: {title} ({priority.value})"

API Reference

Decorators

Decorator Description
@tool Create a tool from a function

Core Classes

Class Description
BaseTool Base class for tools
ToolResult Return content + sideband artifact from a tool
Artifact Artifact entry on Response.artifacts

Utility Functions

Function Description
create_tool_from_function(fn) Create tool from function

Exceptions

Exception Description
Suspend(callback_id, message) Pause the agent run until agent.resume() is called

Further Reading