# Agents
Agents are the core building block of Sagewai. An agent wraps an LLM with a tool-calling loop: send a message, the LLM responds (possibly requesting tool calls), tools are executed, and the loop continues until the LLM produces a final text response.
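The loop above can be sketched in plain Python. This is a minimal illustration with a stubbed LLM and a single fake tool, not Sagewai's actual implementation; all names here are invented for the sketch:

```python
# Minimal sketch of a tool-calling loop with a stubbed LLM.
# A real agent adds memory, guardrails, and streaming on top of this.

def stub_llm(messages):
    """Pretend LLM: requests one tool call, then answers with text."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "get_time", "args": {}}], "content": None}
    return {"tool_calls": None, "content": "It is 12:00."}

TOOLS = {"get_time": lambda: "12:00"}

def run_agent(user_message, max_iterations=10):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        response = stub_llm(messages)
        if not response["tool_calls"]:
            return response["content"]           # final text answer ends the loop
        for call in response["tool_calls"]:      # execute each requested tool
            result = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": result})
    raise RuntimeError("max_iterations exhausted")

print(run_agent("What time is it?"))  # -> It is 12:00.
```

The `max_iterations` bound is what prevents a model that keeps requesting tools from looping forever.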
## BaseAgent

BaseAgent is the abstract base class from which all agents inherit. You never instantiate BaseAgent directly; instead, use one of the concrete agent classes described below.
### Architecture

```
User Message
     |
     v
BaseAgent.chat(message)
     |
     v
[Build Messages] --> [Inject Memory Context]
     |
     v
[Check Input Guardrails]
     |
     v
[ExecutionStrategy.execute()]
     |
     +---> [_call_llm()] --> LLM Response
     |           |
     |           v
     |     Has tool_calls?
     |       YES --> [Execute Tools] --> loop back
     |       NO  --> return text response
     |
     v
[Check Output Guardrails]
     |
     v
Return response text
```
### Constructor Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Agent name (used in logging, events, admin) |
| model | str | "gpt-4o" | LLM model identifier |
| system_prompt | str | "" | System message prepended to all conversations |
| tools | list[ToolSpec] | [] | Tools the agent can use |
| temperature | float | 0.7 | LLM temperature |
| max_tokens | int \| None | None | Max output tokens per LLM call |
| max_iterations | int | 10 | Max tool-calling loop iterations |
| strategy | ExecutionStrategy | ReActStrategy() | Reasoning loop strategy |
| memory | Any | None | Memory backend (ContextEngine, VectorMemory, RAGEngine) |
| guardrails | list[Guardrail] | [] | Input/output safety guardrails |
| max_context_tokens | int \| None | None | Auto-compact context when exceeded |
| directives | bool \| DirectiveEngine | None | Enable directive preprocessing |
| api_base | str \| None | None | Override LLM API base URL |
| api_key | str \| None | None | Override LLM API key |
### Public Methods

#### chat(message: str) -> str

Send a single message and get a text response. This is the simplest interface.

```python
response = await agent.chat("What is quantum computing?")
```
#### chat_with_history(messages: list[ChatMessage]) -> ChatMessage

Run the agent loop with an explicit conversation history. Useful for multi-turn conversations where you manage state externally.

```python
from sagewai import ChatMessage

messages = [
    ChatMessage.system("You are an expert physicist."),
    ChatMessage.user("Explain quantum entanglement"),
]
response = await agent.chat_with_history(messages)
```
#### chat_stream(message: str) -> AsyncGenerator[str, None]

Stream text chunks in real time. Tool calls are handled internally; only text content is yielded.

```python
async for chunk in agent.chat_stream("Tell me about black holes"):
    print(chunk, end="", flush=True)
```
#### on_event(callback)

Register a listener for agent lifecycle events (run started, tool calls, errors, etc.):

```python
from sagewai.core.events import AgentEvent

async def my_handler(event: AgentEvent, data: dict):
    print(f"Event: {event.value}, Data: {data}")

agent.on_event(my_handler)
```
Events emitted: RUN_STARTED, RUN_FINISHED, RUN_ERROR, RUN_CANCELLED, STEP_STARTED, STEP_FINISHED, TOOL_CALL_START, TOOL_CALL_END, TOOL_CALL_RESULT, TEXT_MESSAGE_START, TEXT_MESSAGE_CONTENT, TEXT_MESSAGE_END, GUARDRAIL_ESCALATION, CONTEXT_COMPACTED.
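Under the hood, a listener mechanism like this amounts to fanning each event out to every registered callback. A minimal self-contained sketch (illustrative only; the enum members and `Emitter` class here are invented for the example, not Sagewai internals):

```python
import asyncio
from enum import Enum

class AgentEvent(Enum):
    RUN_STARTED = "run_started"
    RUN_FINISHED = "run_finished"

class Emitter:
    def __init__(self):
        self._listeners = []

    def on_event(self, callback):
        self._listeners.append(callback)

    async def emit(self, event, data):
        # Fan the event out to every registered async callback.
        for cb in self._listeners:
            await cb(event, data)

async def main():
    emitter = Emitter()
    seen = []

    async def handler(event, data):
        seen.append((event.value, data))

    emitter.on_event(handler)
    await emitter.emit(AgentEvent.RUN_STARTED, {"agent": "assistant"})
    await emitter.emit(AgentEvent.RUN_FINISHED, {"result": "done"})
    return seen

seen = asyncio.run(main())
print(seen)
```

Because callbacks are awaited, a slow handler can delay the agent loop; keep handlers lightweight or dispatch heavy work to a background task.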
## UniversalAgent

UniversalAgent is the primary concrete agent, backed by LiteLLM. It supports 100+ LLM providers through a unified interface.

```python
from sagewai import UniversalAgent

agent = UniversalAgent(
    name="assistant",
    model="gpt-4o",
    system_prompt="You are a helpful assistant.",
    temperature=0.7,
)
response = await agent.chat("Hello!")
```
### Supported Models
Any model that LiteLLM supports works out of the box:
| Provider | Model Examples | Prefix |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1-preview | (none) |
| Anthropic | claude-sonnet-4-20250514, claude-3-haiku-20240307 | (none) |
| Google Gemini | gemini/gemini-2.0-flash, gemini/gemini-2.5-pro | gemini/ |
| Mistral | mistral/mistral-large-latest | mistral/ |
| Cohere | command-r-plus | (none) |
| Azure OpenAI | azure/gpt-4o | azure/ |
| AWS Bedrock | bedrock/anthropic.claude-3-sonnet | bedrock/ |
| Ollama (local) | ollama/llama3.1, ollama/codellama | ollama/ |
| Together AI | together_ai/meta-llama/Llama-3.1-405B | together_ai/ |
| Groq | groq/llama-3.1-70b-versatile | groq/ |
### Streaming

UniversalAgent implements true token-level streaming via `_stream_llm`, accumulating tool-call fragments across chunks:

```python
async for chunk in agent.chat_stream("Explain relativity"):
    print(chunk, end="")
```
## GoogleNativeAgent

GoogleNativeAgent uses the `google.genai` SDK for native Gemini access, bypassing LiteLLM. This provides access to Gemini-specific features like native function calling.

```python
from sagewai import GoogleNativeAgent

agent = GoogleNativeAgent(
    name="gemini-agent",
    model="gemini-2.0-flash",
    system_prompt="You are a helpful assistant.",
)
response = await agent.chat("Hello!")
```
Use GoogleNativeAgent when you need:
- Native Gemini function calling format
- Direct access to Google GenAI SDK features
- Vertex AI integration
Use UniversalAgent with `model="gemini/..."` for general-purpose Gemini access via LiteLLM.
## Tools

Tools are defined using the `@tool` decorator, which converts a function into a ToolSpec:

```python
from sagewai import tool

@tool
async def search_database(query: str, limit: int = 10) -> str:
    """Search the knowledge base for relevant documents.

    Args:
        query: The search query string.
        limit: Maximum number of results to return.
    """
    results = await db.search(query, limit=limit)
    return format_results(results)
```
The decorator extracts:
- Name from the function name
- Description from the docstring
- Parameters from function type annotations
- Handler reference to execute the function
Both sync and async handlers are supported. Tool execution is parallelized when multiple tools are called in a single LLM response.
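The extraction step can be approximated with Python's `inspect` module. This is a simplified sketch of what such a decorator does (name from `__name__`, description from the docstring, parameter schema from annotations), not the actual `@tool` implementation:

```python
import inspect

def tool(fn):
    """Sketch: derive a tool spec from a function's metadata."""
    sig = inspect.signature(fn)
    params = {}
    for name, p in sig.parameters.items():
        params[name] = {
            "type": p.annotation.__name__,                  # from type annotation
            "required": p.default is inspect.Parameter.empty,
        }
    fn.spec = {
        "name": fn.__name__,                # name from the function name
        "description": inspect.getdoc(fn),  # description from the docstring
        "parameters": params,               # schema from the signature
    }
    return fn

@tool
def search_database(query: str, limit: int = 10) -> str:
    """Search the knowledge base."""
    return f"results for {query!r} (limit={limit})"

print(search_database.spec["name"])                    # search_database
print(search_database.spec["parameters"]["limit"])     # optional int
```

A production version would also map Python types to a JSON Schema for the LLM's tool-call format; the principle is the same.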
### MCP Tools

Tools can also be discovered from MCP (Model Context Protocol) servers:

```python
from sagewai import McpClient

# Connect to an MCP server via stdio
tools = await McpClient.connect(["python", "-m", "mcp_stripe"])

# Or via SSE
tools = await McpClient.connect_sse("http://localhost:8080/sse")

# Use discovered tools with any agent
agent = UniversalAgent(name="auditor", model="gpt-4o", tools=tools)
```
## Agent Composition
Sagewai provides four deterministic workflow agents for composing sub-agents into pipelines. Unlike a single agent that decides what to do via LLM, workflow agents follow a fixed structure while each sub-agent within uses its own LLM.
### SequentialAgent

Execute sub-agents one after another, passing each agent's output as input to the next:

```python
from sagewai import UniversalAgent, SequentialAgent

researcher = UniversalAgent(
    name="researcher",
    model="gpt-4o",
    system_prompt="You research topics and return key findings.",
)
writer = UniversalAgent(
    name="writer",
    model="claude-sonnet-4-20250514",
    system_prompt="You write polished articles from research notes.",
)
reviewer = UniversalAgent(
    name="reviewer",
    model="gpt-4o-mini",
    system_prompt="You review articles for accuracy.",
)

pipeline = SequentialAgent(
    name="article-pipeline",
    agents=[researcher, writer, reviewer],
)
result = await pipeline.chat("Write about the future of quantum computing")
```
### ParallelAgent

Run multiple agents concurrently on the same input and merge their outputs:

```python
from sagewai import UniversalAgent, ParallelAgent

legal = UniversalAgent(name="legal", system_prompt="Review for legal issues.")
financial = UniversalAgent(name="financial", system_prompt="Review for financial accuracy.")
grammar = UniversalAgent(name="grammar", model="gpt-4o-mini", system_prompt="Review grammar.")

review_panel = ParallelAgent(
    name="review-panel",
    agents=[legal, financial, grammar],
)
result = await review_panel.chat("Review this contract: ...")
```

All agents process the same input via `asyncio.gather()`. Results are merged with a default newline joiner, or you can pass a custom merge function.
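The fan-out-and-merge behavior can be sketched with plain asyncio. Stub coroutines stand in for LLM-backed agents here; the `run_parallel` helper and its default newline joiner are illustrative, not Sagewai's internals:

```python
import asyncio

async def legal(text):
    return f"legal: {text} looks fine"

async def grammar(text):
    return f"grammar: {text} reads well"

async def run_parallel(agents, text, merge="\n".join):
    # Every agent receives the same input; gather runs them concurrently
    # and preserves the order of the agents list in the results.
    results = await asyncio.gather(*(agent(text) for agent in agents))
    return merge(results)

merged = asyncio.run(run_parallel([legal, grammar], "the contract"))
print(merged)
```

A custom merge function would simply replace `"\n".join`, e.g. one that asks another LLM to synthesize the perspectives.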
### ConditionalAgent

Route input to different agents based on a condition function:

```python
from sagewai import UniversalAgent, ConditionalAgent

escalation = UniversalAgent(name="escalation", system_prompt="Handle complaints.")
auto_reply = UniversalAgent(name="auto-reply", system_prompt="Respond helpfully.")

router = ConditionalAgent(
    name="sentiment-router",
    condition=lambda text: "negative" if "terrible" in text.lower() else "positive",
    branches={
        "negative": escalation,
        "positive": auto_reply,
    },
    default_branch=auto_reply,
)
result = await router.chat("This product is terrible!")
# Routes to the escalation agent
```
The condition can be synchronous or async. For LLM-based classification, pass an async function that calls a classifier model.
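An async, classifier-backed condition might look like the sketch below. The classifier is stubbed with a keyword check; in practice it would prompt a small model and parse its label. All names here are hypothetical:

```python
import asyncio

async def classify_sentiment(text):
    # Stub for an async LLM classifier call.
    return "negative" if "terrible" in text.lower() else "positive"

async def route(text, branches, default):
    label = await classify_sentiment(text)   # async condition
    handler = branches.get(label, default)   # fall back to the default branch
    return await handler(text)

async def escalation(text):
    return "escalated"

async def auto_reply(text):
    return "auto-replied"

result = asyncio.run(route(
    "This product is terrible!",
    branches={"negative": escalation, "positive": auto_reply},
    default=auto_reply,
))
print(result)  # escalated
```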
### LoopAgent

Repeat a single agent until a condition is met or `max_iterations` is exhausted:

```python
from sagewai import UniversalAgent, LoopAgent

refiner = UniversalAgent(
    name="refiner",
    model="gpt-4o",
    system_prompt="Improve the text. Output DONE when satisfied.",
)

loop = LoopAgent(
    name="iterative-refiner",
    agent=refiner,
    max_iterations=5,
    should_stop=lambda result, iteration: "DONE" in result,
)
result = await loop.chat("Draft: AI is good at many things...")
```
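The stop-condition contract (a callable receiving the latest result and the iteration count) can be sketched as follows, with a stub refiner in place of an LLM; `run_loop` is illustrative, not the actual LoopAgent:

```python
import asyncio

async def refiner(text):
    # Stub: append a marker each pass and declare DONE on the third.
    text += " +pass"
    return text + " DONE" if text.count("+pass") >= 3 else text

async def run_loop(agent, text, max_iterations, should_stop):
    for iteration in range(1, max_iterations + 1):
        text = await agent(text)
        if should_stop(text, iteration):
            return text
    return text  # max_iterations exhausted; return the last result

result = asyncio.run(run_loop(
    refiner, "Draft", max_iterations=5,
    should_stop=lambda result, iteration: "DONE" in result,
))
print(result)  # Draft +pass +pass +pass DONE
```

Note that the loop still terminates with the last result when `max_iterations` runs out, so callers always get output even if the stop marker never appears.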
### Agent-as-Tool

Wrap any agent as a tool so that an orchestrator agent can invoke sub-agents dynamically based on LLM reasoning:

```python
from sagewai import UniversalAgent, agent_as_tool

researcher = UniversalAgent(name="researcher", model="gpt-4o")
writer = UniversalAgent(name="writer", model="claude-sonnet-4-20250514")

orchestrator = UniversalAgent(
    name="orchestrator",
    model="gpt-4o",
    tools=[
        agent_as_tool(researcher, description="Researches a topic thoroughly"),
        agent_as_tool(writer, description="Writes polished content"),
    ],
)
result = await orchestrator.chat("Research and write about quantum computing")
```
The orchestrator's LLM decides which sub-agents to invoke and in what order. This differs from SequentialAgent (fixed order) and ConditionalAgent (rule-based routing) because the LLM makes dynamic delegation decisions.
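Conceptually, wrapping an agent as a tool means exposing its chat method as a single-parameter callable with a name and description the orchestrator's LLM can see. A sketch under those assumptions (StubAgent and this `agent_as_tool` are invented for illustration):

```python
import asyncio

class StubAgent:
    def __init__(self, name):
        self.name = name

    async def chat(self, message):
        return f"{self.name} handled: {message}"

def agent_as_tool(agent, description):
    """Sketch: expose an agent as a one-parameter async tool."""
    async def handler(message: str) -> str:
        return await agent.chat(message)     # delegate to the wrapped agent
    handler.__name__ = agent.name            # tool name the LLM will see
    handler.__doc__ = description            # tool description the LLM will see
    return handler

researcher_tool = agent_as_tool(StubAgent("researcher"), "Researches a topic")
out = asyncio.run(researcher_tool("quantum computing"))
print(out)  # researcher handled: quantum computing
```

From the orchestrator's point of view the sub-agent is just another tool call; the delegation happens inside the handler.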
## Choosing an Agent Pattern

| Pattern | Orchestration | Use Case |
|---|---|---|
| Single UniversalAgent | LLM decides everything | Simple Q&A, single-domain tasks |
| SequentialAgent | Fixed pipeline | Research -> Write -> Review |
| ParallelAgent | Fan-out, merge | Multi-perspective analysis |
| ConditionalAgent | Rule-based routing | Intent classification, triage |
| LoopAgent | Iterative refinement | Edit until quality threshold |
| agent_as_tool | LLM-decided delegation | Dynamic multi-agent orchestration |
Patterns compose freely. A SequentialAgent can contain a ParallelAgent as one of its steps, which itself contains UniversalAgent sub-agents with different models and strategies.
## What's Next
- Strategies — Control how agents reason: ReAct, Tree of Thoughts, LATS, Planning
- Memory — Give agents long-term memory with vector, graph, and hybrid retrieval
- Workflows — Durable execution with checkpointing, human approval, distributed workers