Embeddings

See Models Overview for the 3-tier API and factory functions.

All 9 embedding providers support a standardized API with rich metadata and flexible usage patterns:

from cogent.models import OpenAIEmbedding, GeminiEmbedding, OllamaEmbedding

embedder = OpenAIEmbedding(model="text-embedding-3-small")

# Primary API: embed() / aembed() - Returns EmbeddingResult with full metadata
# (the `await` calls below assume an async context, e.g. inside an `async def` function)
result = await embedder.aembed(["Hello world", "Cogent"])
print(result.embeddings)            # list[list[float]] - the actual vectors
print(result.metadata.model)        # "text-embedding-3-small"
print(result.metadata.tokens)       # TokenUsage(prompt=4, completion=0, total=4)
print(result.metadata.dimensions)   # 1536
print(result.metadata.duration)     # 0.181 seconds
print(result.metadata.num_texts)    # 2

# Convenience: embed_one() / aembed_one() - Single text, returns vector only
vector = await embedder.aembed_one("Single text")
print(len(vector))  # 1536

# Sync versions
result = embedder.embed(["Text 1", "Text 2"])
vector = embedder.embed_one("Single text")

# VectorStore protocol: embed_texts() / embed_query() - Async, no metadata
vectors = await embedder.embed_texts(["Doc1", "Doc2"])  # list[list[float]]
query_vec = await embedder.embed_query("Search query")  # list[float]

Standardized API Summary:

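| Method | Input | Returns | Async | Metadata |
| --- | --- | --- | --- | --- |
| embed(texts) | list[str] | EmbeddingResult | No | Yes |
| aembed(texts) | list[str] | EmbeddingResult | Yes | Yes |
| embed_one(text) | str | list[float] | No | No |
| aembed_one(text) | str | list[float] | Yes | No |
| embed_texts(texts) | list[str] | list[list[float]] | Yes | No |
| embed_query(text) | str | list[float] | Yes | No |
| dimension | property | int | - | - |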
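
The two VectorStore-facing methods amount to a small structural interface. As a minimal sketch, assuming a hypothetical `SupportsEmbedding` protocol and a toy `HashEmbedder` stand-in (neither is part of cogent), any object with async `embed_texts()` and `embed_query()` could be consumed like this:

```python
import asyncio
import hashlib
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsEmbedding(Protocol):
    """Hypothetical protocol mirroring the two VectorStore-facing methods."""

    async def embed_texts(self, texts: list[str]) -> list[list[float]]: ...

    async def embed_query(self, text: str) -> list[float]: ...


class HashEmbedder:
    """Toy stand-in (not a cogent class): deterministic vectors from SHA-256."""

    def __init__(self, dimension: int = 8) -> None:
        # SHA-256 yields 32 bytes, so this toy supports at most 32 dimensions.
        self.dimension = dimension

    def _vector(self, text: str) -> list[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map the first `dimension` bytes to floats in [0, 1].
        return [b / 255 for b in digest[: self.dimension]]

    async def embed_texts(self, texts: list[str]) -> list[list[float]]:
        return [self._vector(t) for t in texts]

    async def embed_query(self, text: str) -> list[float]:
        return self._vector(text)


async def index_and_query(embedder: SupportsEmbedding) -> tuple[int, int]:
    """Return (number of document vectors, length of the query vector)."""
    doc_vectors = await embedder.embed_texts(["Doc1", "Doc2"])
    query_vector = await embedder.embed_query("Search query")
    return len(doc_vectors), len(query_vector)


counts = asyncio.run(index_and_query(HashEmbedder(dimension=8)))
```

Because the protocol is structural, a real embedder would not need to inherit from it; having the two methods is enough.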

Embedding Metadata

All 9 embedding providers return an EmbeddingMetadata object. Token usage is populated only when the provider's API reports it:

| Provider | Token Usage | Notes |
| --- | --- | --- |
| OpenAI | Yes | Extracts from response.usage.prompt_tokens |
| Cohere | Yes | Extracts from response.meta.billed_units.input_tokens |
| Mistral | Yes | Uses OpenAI SDK, provides token counts |
| Azure OpenAI | Yes | Extracts from response.usage like OpenAI |
| Gemini | No | API doesn't provide token counts for embeddings |
| Ollama | No | Local embeddings, no token tracking |
| Cloudflare | No | API doesn't track tokens |
| Mock | No | Test embedding, no real tokens |
| Custom | Conditional | Depends on underlying API |

Metadata Structure:

from dataclasses import dataclass

@dataclass
class EmbeddingMetadata:
    id: str                     # Unique request ID
    timestamp: str              # ISO 8601 timestamp
    model: str | None           # Model name/version
    tokens: TokenUsage | None   # Token usage (if available)
    duration: float             # Request duration (seconds)
    dimensions: int | None      # Vector dimensions
    num_texts: int              # Number of texts embedded

@dataclass
class EmbeddingResult:
    embeddings: list[list[float]]  # The actual embedding vectors
    metadata: EmbeddingMetadata    # Complete metadata

Usage Examples:

# Use case 1: Need metadata for cost tracking
result = await embedder.aembed(["Text 1", "Text 2"])
vectors = result.embeddings
tokens = result.metadata.tokens  # Track token usage for billing
duration = result.metadata.duration  # Monitor performance

# Use case 2: Simple embedding without metadata
vector = await embedder.aembed_one("Query text")  # Just returns the vector

# Use case 3: VectorStore integration (protocol compliance)
# These methods are used internally by VectorStore
vectors = await embedder.embed_texts(["Document 1", "Document 2"])
query_vec = await embedder.embed_query("Search query")

# Use case 4: Sync batch embedding
result = embedder.embed(large_batch)  # Sync version for compatibility; large_batch is any list[str]
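
Once document and query vectors are available (use case 3), a store typically compares them by similarity. The following is a minimal pure-Python sketch of cosine-similarity ranking. It is not cogent's VectorStore implementation; `cosine_similarity()` and `rank()` are hypothetical helpers, and the vectors are hand-written placeholders rather than real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec: list[float], vectors: list[list[float]]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder 2-dimensional vectors standing in for embed_texts() / embed_query() output.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vec = [1.0, 0.1]
ranking = rank(query_vec, vectors)
```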

Observability Benefits:

  • Cost tracking — Monitor token usage across providers
  • Performance — Track request duration and batch sizes
  • Debugging — Trace requests with unique IDs and timestamps
  • Model versioning — Know which embedding model version was used
  • Capacity planning — Understand dimensions and text counts