Agents

An agent is Sagewai's core building block: it wraps an LLM with a tool-calling loop. You send a message; the LLM responds (optionally requesting tool calls); tools execute; the loop continues until the LLM returns a final text response. This page covers the available agent types, their constructors, and how to compose them into pipelines.

Prerequisites: Get started · Next: Strategies · Memory & RAG

BaseAgent

BaseAgent is the abstract foundation all agents inherit from. You never instantiate it directly — use one of the concrete engines below.

Architecture

User Message
    |
    v
BaseAgent.chat(message)
    |
    v
[Build Messages] --> [Inject Memory Context]
    |
    v
[Check Input Guardrails]
    |
    v
[ExecutionStrategy.execute()]
    |
    +---> [_call_llm()] --> LLM Response
    |         |
    |         v
    |    Has tool_calls?
    |    YES --> [Execute Tools] --> Loop back
    |    NO  --> Return text response
    |
    v
[Check Output Guardrails]
    |
    v
Return response text

Constructor Parameters

ParameterTypeDefaultDescription
namestrrequiredAgent name (used in logging, events, admin)
modelstr"gpt-4o"LLM model identifier
system_promptstr""System message prepended to all conversations
toolslist[ToolSpec][]Tools the agent can use
temperaturefloat0.7LLM temperature
max_tokensint | NoneNoneMax output tokens per LLM call
max_iterationsint10Max tool-calling loop iterations
strategyExecutionStrategyReActStrategy()Reasoning loop strategy
memoryAnyNoneMemory backend (ContextEngine, VectorMemory, RAGEngine)
guardrailslist[Guardrail][]Input/output safety guardrails
max_context_tokensint | NoneNoneAuto-compact context when exceeded
directivesbool | DirectiveEngineNoneEnable directive preprocessing
api_basestr | NoneNoneOverride LLM API base URL
api_keystr | NoneNoneOverride LLM API key

Public Methods

chat(message: str) -> str

Send a single message and get a text response. This is the simplest entry point.

response = await agent.chat("What is quantum computing?")

chat_with_history(messages: list[ChatMessage]) -> ChatMessage

Run the agent loop against an explicit conversation history. Use this when you manage multi-turn state yourself rather than delegating to ConversationManager.

from sagewai import ChatMessage

messages = [
    ChatMessage.system("You are an expert physicist."),
    ChatMessage.user("Explain quantum entanglement"),
]
response = await agent.chat_with_history(messages)

chat_stream(message: str) -> AsyncGenerator[str, None]

Stream text chunks as they arrive. Tool calls are handled internally — only the final text content is yielded.

async for chunk in agent.chat_stream("Tell me about black holes"):
    print(chunk, end="", flush=True)

on_event(callback)

Register a listener for agent lifecycle events (run started, tool calls, errors, etc.):

from sagewai.core.events import AgentEvent

async def my_handler(event: AgentEvent, data: dict):
    print(f"Event: {event.value}, Data: {data}")

agent.on_event(my_handler)

Events emitted: RUN_STARTED, RUN_FINISHED, RUN_ERROR, RUN_CANCELLED, STEP_STARTED, STEP_FINISHED, TOOL_CALL_START, TOOL_CALL_END, TOOL_CALL_RESULT, TEXT_MESSAGE_START, TEXT_MESSAGE_CONTENT, TEXT_MESSAGE_END, GUARDRAIL_ESCALATION, CONTEXT_COMPACTED.


UniversalAgent

UniversalAgent is the standard concrete agent. It routes to 100+ LLM providers through a single interface — swap model to change the provider.

from sagewai import UniversalAgent

agent = UniversalAgent(
    name="assistant",
    model="gpt-4o",
    system_prompt="You are a helpful assistant.",
    temperature=0.7,
)

response = await agent.chat("Hello!")

Supported Models

ProviderModel ExamplesPrefix
OpenAIgpt-4o, gpt-4o-mini, o1-preview(none)
Anthropicclaude-sonnet-4-20250514, claude-3-haiku-20240307(none)
Google Geminigemini/gemini-2.0-flash, gemini/gemini-2.5-progemini/
Mistralmistral/mistral-large-latestmistral/
Coherecommand-r-plus(none)
Azure OpenAIazure/gpt-4oazure/
AWS Bedrockbedrock/anthropic.claude-3-sonnetbedrock/
Ollama (local)ollama/llama3.1, ollama/codellamaollama/
Together AItogether_ai/meta-llama/Llama-3.1-405Btogether_ai/
Groqgroq/llama-3.1-70b-versatilegroq/

Streaming

UniversalAgent streams at the token level via _stream_llm. It accumulates tool call fragments across chunks so that your async for loop only sees text content:

async for chunk in agent.chat_stream("Explain relativity"):
    print(chunk, end="")

GoogleNativeAgent

GoogleNativeAgent calls the google.genai SDK directly, bypassing the standard provider routing layer. Use it when you need Gemini-specific features that are not exposed through the unified interface.

from sagewai import GoogleNativeAgent

agent = GoogleNativeAgent(
    name="gemini-agent",
    model="gemini-2.0-flash",
    system_prompt="You are a helpful assistant.",
)

response = await agent.chat("Hello!")

When to pick GoogleNativeAgent over UniversalAgent:

  • You need native Gemini function calling format
  • You are using Vertex AI
  • You need direct access to google.genai SDK features

For everything else, use UniversalAgent with model="gemini/...".


Tools

Define tools with the @tool decorator. It converts a plain function into a ToolSpec — extracting the name, description, and parameter schema automatically:

from sagewai import tool

@tool
async def search_database(query: str, limit: int = 10) -> str:
    """Search the knowledge base for relevant documents.

    Args:
        query: The search query string.
        limit: Maximum number of results to return.
    """
    results = await db.search(query, limit=limit)
    return format_results(results)

The decorator reads:

  • Name from the function name
  • Description from the docstring
  • Parameters from type annotations
  • Handler reference for execution

Both sync and async functions work. When the LLM calls multiple tools in a single response, Sagewai executes them in parallel.

MCP Tools

Discover tools from any MCP (Model Context Protocol) server and pass them directly to an agent:

from sagewai import McpClient

# Connect via stdio
tools = await McpClient.connect(["python", "-m", "mcp_stripe"])

# Or via SSE
tools = await McpClient.connect_sse("http://localhost:8080/sse")

# Use discovered tools with any agent
agent = UniversalAgent(name="auditor", model="gpt-4o", tools=tools)

Agent Composition

Sagewai provides four workflow agents that compose sub-agents into deterministic pipelines. The workflow controls the structure; each sub-agent uses its own LLM for the actual work.

SequentialAgent

Execute sub-agents one after another. Each agent's output becomes the next agent's input:

from sagewai import UniversalAgent, SequentialAgent

researcher = UniversalAgent(
    name="researcher",
    model="gpt-4o",
    system_prompt="You research topics and return key findings.",
)
writer = UniversalAgent(
    name="writer",
    model="claude-sonnet-4-20250514",
    system_prompt="You write polished articles from research notes.",
)
reviewer = UniversalAgent(
    name="reviewer",
    model="gpt-4o-mini",
    system_prompt="You review articles for accuracy.",
)

pipeline = SequentialAgent(
    name="article-pipeline",
    agents=[researcher, writer, reviewer],
)

result = await pipeline.chat("Write about the future of quantum computing")

ParallelAgent

Run multiple agents on the same input at the same time. Their outputs are merged:

from sagewai import UniversalAgent, ParallelAgent

legal = UniversalAgent(name="legal", system_prompt="Review for legal issues.")
financial = UniversalAgent(name="financial", system_prompt="Review for financial accuracy.")
grammar = UniversalAgent(name="grammar", model="gpt-4o-mini", system_prompt="Review grammar.")

review_panel = ParallelAgent(
    name="review-panel",
    agents=[legal, financial, grammar],
)

result = await review_panel.chat("Review this contract: ...")

All agents run via asyncio.gather(). By default, results are joined with newlines. Pass a custom merge function to change that.

ConditionalAgent

Route input to a specific agent based on a condition function:

from sagewai import UniversalAgent, ConditionalAgent

escalation = UniversalAgent(name="escalation", system_prompt="Handle complaints.")
auto_reply = UniversalAgent(name="auto-reply", system_prompt="Respond helpfully.")

router = ConditionalAgent(
    name="sentiment-router",
    condition=lambda text: "negative" if "terrible" in text.lower() else "positive",
    branches={
        "negative": escalation,
        "positive": auto_reply,
    },
    default_branch=auto_reply,
)

result = await router.chat("This product is terrible!")
# Routes to the escalation agent

The condition can be sync or async. For LLM-based classification, pass an async function that calls a classifier model.

LoopAgent

Repeat a single agent until a stop condition is met or max_iterations runs out:

from sagewai import UniversalAgent, LoopAgent

refiner = UniversalAgent(
    name="refiner",
    model="gpt-4o",
    system_prompt="Improve the text. Output DONE when satisfied.",
)

loop = LoopAgent(
    name="iterative-refiner",
    agent=refiner,
    max_iterations=5,
    should_stop=lambda result, iteration: "DONE" in result,
)

result = await loop.chat("Draft: AI is good at many things...")

Agent-as-Tool

Wrap an agent as a tool so that an orchestrator can delegate to sub-agents dynamically, based on LLM reasoning rather than fixed rules:

from sagewai import UniversalAgent, agent_as_tool

researcher = UniversalAgent(name="researcher", model="gpt-4o")
writer = UniversalAgent(name="writer", model="claude-sonnet-4-20250514")

orchestrator = UniversalAgent(
    name="orchestrator",
    model="gpt-4o",
    tools=[
        agent_as_tool(researcher, description="Researches a topic thoroughly"),
        agent_as_tool(writer, description="Writes polished content"),
    ],
)

result = await orchestrator.chat("Research and write about quantum computing")

Here the orchestrator's LLM decides which sub-agents to call and in what order. This differs from SequentialAgent (fixed order) and ConditionalAgent (condition-based routing) — the LLM drives the delegation.


Choosing an Agent Pattern

PatternOrchestrationUse Case
Single UniversalAgentLLM decides everythingSimple Q&A, single-domain tasks
SequentialAgentFixed pipelineResearch -> Write -> Review
ParallelAgentFan-out, mergeMulti-perspective analysis
ConditionalAgentRule-based routingIntent classification, triage
LoopAgentIterative refinementEdit until quality threshold
agent_as_toolLLM-decided delegationDynamic multi-agent orchestration

These patterns compose. A SequentialAgent can include a ParallelAgent as one of its steps, which itself holds UniversalAgent sub-agents with different models and strategies.


What's Next

  • Strategies — Control how agents reason: ReAct, Tree of Thoughts, LATS, Planning
  • Memory & RAG — Add long-term memory with vector, graph, and hybrid retrieval
  • Workflows — Durable execution with checkpointing, human approval, and distributed workers