Agents
An agent is Sagewai's core building block: it wraps an LLM with a tool-calling loop. You send a message; the LLM responds (optionally requesting tool calls); tools execute; the loop continues until the LLM returns a final text response. This page covers the available agent types, their constructors, and how to compose them into pipelines.
Prerequisites: Get started · Next: Strategies · Memory & RAG
BaseAgent
BaseAgent is the abstract foundation all agents inherit from. You never instantiate it directly — use one of the concrete engines below.
Architecture
User Message
|
v
BaseAgent.chat(message)
|
v
[Build Messages] --> [Inject Memory Context]
|
v
[Check Input Guardrails]
|
v
[ExecutionStrategy.execute()]
|
+---> [_call_llm()] --> LLM Response
| |
| v
| Has tool_calls?
| YES --> [Execute Tools] --> Loop back
| NO --> Return text response
|
v
[Check Output Guardrails]
|
v
Return response text
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
name | str | required | Agent name (used in logging, events, admin) |
model | str | "gpt-4o" | LLM model identifier |
system_prompt | str | "" | System message prepended to all conversations |
tools | list[ToolSpec] | [] | Tools the agent can use |
temperature | float | 0.7 | LLM temperature |
max_tokens | int | None | None | Max output tokens per LLM call |
max_iterations | int | 10 | Max tool-calling loop iterations |
strategy | ExecutionStrategy | ReActStrategy() | Reasoning loop strategy |
memory | Any | None | Memory backend (ContextEngine, VectorMemory, RAGEngine) |
guardrails | list[Guardrail] | [] | Input/output safety guardrails |
max_context_tokens | int | None | None | Auto-compact context when exceeded |
directives | bool | DirectiveEngine | None | Enable directive preprocessing |
api_base | str | None | None | Override LLM API base URL |
api_key | str | None | None | Override LLM API key |
Public Methods
chat(message: str) -> str
Send a single message and get a text response. This is the simplest entry point.
response = await agent.chat("What is quantum computing?")
chat_with_history(messages: list[ChatMessage]) -> ChatMessage
Run the agent loop against an explicit conversation history. Use this when you manage multi-turn state yourself rather than delegating to ConversationManager.
from sagewai import ChatMessage
messages = [
ChatMessage.system("You are an expert physicist."),
ChatMessage.user("Explain quantum entanglement"),
]
response = await agent.chat_with_history(messages)
chat_stream(message: str) -> AsyncGenerator[str, None]
Stream text chunks as they arrive. Tool calls are handled internally — only the final text content is yielded.
async for chunk in agent.chat_stream("Tell me about black holes"):
print(chunk, end="", flush=True)
on_event(callback)
Register a listener for agent lifecycle events (run started, tool calls, errors, etc.):
from sagewai.core.events import AgentEvent
async def my_handler(event: AgentEvent, data: dict):
print(f"Event: {event.value}, Data: {data}")
agent.on_event(my_handler)
Events emitted: RUN_STARTED, RUN_FINISHED, RUN_ERROR, RUN_CANCELLED, STEP_STARTED, STEP_FINISHED, TOOL_CALL_START, TOOL_CALL_END, TOOL_CALL_RESULT, TEXT_MESSAGE_START, TEXT_MESSAGE_CONTENT, TEXT_MESSAGE_END, GUARDRAIL_ESCALATION, CONTEXT_COMPACTED.
UniversalAgent
UniversalAgent is the standard concrete agent. It routes to 100+ LLM providers through a single interface — swap model to change the provider.
from sagewai import UniversalAgent
agent = UniversalAgent(
name="assistant",
model="gpt-4o",
system_prompt="You are a helpful assistant.",
temperature=0.7,
)
response = await agent.chat("Hello!")
Supported Models
| Provider | Model Examples | Prefix |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini, o1-preview | (none) |
| Anthropic | claude-sonnet-4-20250514, claude-3-haiku-20240307 | (none) |
| Google Gemini | gemini/gemini-2.0-flash, gemini/gemini-2.5-pro | gemini/ |
| Mistral | mistral/mistral-large-latest | mistral/ |
| Cohere | command-r-plus | (none) |
| Azure OpenAI | azure/gpt-4o | azure/ |
| AWS Bedrock | bedrock/anthropic.claude-3-sonnet | bedrock/ |
| Ollama (local) | ollama/llama3.1, ollama/codellama | ollama/ |
| Together AI | together_ai/meta-llama/Llama-3.1-405B | together_ai/ |
| Groq | groq/llama-3.1-70b-versatile | groq/ |
Streaming
UniversalAgent streams at the token level via _stream_llm. It accumulates tool call fragments across chunks so that your async for loop only sees text content:
async for chunk in agent.chat_stream("Explain relativity"):
print(chunk, end="")
GoogleNativeAgent
GoogleNativeAgent calls the google.genai SDK directly, bypassing the standard provider routing layer. Use it when you need Gemini-specific features that are not exposed through the unified interface.
from sagewai import GoogleNativeAgent
agent = GoogleNativeAgent(
name="gemini-agent",
model="gemini-2.0-flash",
system_prompt="You are a helpful assistant.",
)
response = await agent.chat("Hello!")
When to pick GoogleNativeAgent over UniversalAgent:
- You need native Gemini function calling format
- You are using Vertex AI
- You need direct access to
google.genaiSDK features
For everything else, use UniversalAgent with model="gemini/...".
Tools
Define tools with the @tool decorator. It converts a plain function into a ToolSpec — extracting the name, description, and parameter schema automatically:
from sagewai import tool
@tool
async def search_database(query: str, limit: int = 10) -> str:
"""Search the knowledge base for relevant documents.
Args:
query: The search query string.
limit: Maximum number of results to return.
"""
results = await db.search(query, limit=limit)
return format_results(results)
The decorator reads:
- Name from the function name
- Description from the docstring
- Parameters from type annotations
- Handler reference for execution
Both sync and async functions work. When the LLM calls multiple tools in a single response, Sagewai executes them in parallel.
MCP Tools
Discover tools from any MCP (Model Context Protocol) server and pass them directly to an agent:
from sagewai import McpClient
# Connect via stdio
tools = await McpClient.connect(["python", "-m", "mcp_stripe"])
# Or via SSE
tools = await McpClient.connect_sse("http://localhost:8080/sse")
# Use discovered tools with any agent
agent = UniversalAgent(name="auditor", model="gpt-4o", tools=tools)
Agent Composition
Sagewai provides four workflow agents that compose sub-agents into deterministic pipelines. The workflow controls the structure; each sub-agent uses its own LLM for the actual work.
SequentialAgent
Execute sub-agents one after another. Each agent's output becomes the next agent's input:
from sagewai import UniversalAgent, SequentialAgent
researcher = UniversalAgent(
name="researcher",
model="gpt-4o",
system_prompt="You research topics and return key findings.",
)
writer = UniversalAgent(
name="writer",
model="claude-sonnet-4-20250514",
system_prompt="You write polished articles from research notes.",
)
reviewer = UniversalAgent(
name="reviewer",
model="gpt-4o-mini",
system_prompt="You review articles for accuracy.",
)
pipeline = SequentialAgent(
name="article-pipeline",
agents=[researcher, writer, reviewer],
)
result = await pipeline.chat("Write about the future of quantum computing")
ParallelAgent
Run multiple agents on the same input at the same time. Their outputs are merged:
from sagewai import UniversalAgent, ParallelAgent
legal = UniversalAgent(name="legal", system_prompt="Review for legal issues.")
financial = UniversalAgent(name="financial", system_prompt="Review for financial accuracy.")
grammar = UniversalAgent(name="grammar", model="gpt-4o-mini", system_prompt="Review grammar.")
review_panel = ParallelAgent(
name="review-panel",
agents=[legal, financial, grammar],
)
result = await review_panel.chat("Review this contract: ...")
All agents run via asyncio.gather(). By default, results are joined with newlines. Pass a custom merge function to change that.
ConditionalAgent
Route input to a specific agent based on a condition function:
from sagewai import UniversalAgent, ConditionalAgent
escalation = UniversalAgent(name="escalation", system_prompt="Handle complaints.")
auto_reply = UniversalAgent(name="auto-reply", system_prompt="Respond helpfully.")
router = ConditionalAgent(
name="sentiment-router",
condition=lambda text: "negative" if "terrible" in text.lower() else "positive",
branches={
"negative": escalation,
"positive": auto_reply,
},
default_branch=auto_reply,
)
result = await router.chat("This product is terrible!")
# Routes to the escalation agent
The condition can be sync or async. For LLM-based classification, pass an async function that calls a classifier model.
LoopAgent
Repeat a single agent until a stop condition is met or max_iterations runs out:
from sagewai import UniversalAgent, LoopAgent
refiner = UniversalAgent(
name="refiner",
model="gpt-4o",
system_prompt="Improve the text. Output DONE when satisfied.",
)
loop = LoopAgent(
name="iterative-refiner",
agent=refiner,
max_iterations=5,
should_stop=lambda result, iteration: "DONE" in result,
)
result = await loop.chat("Draft: AI is good at many things...")
Agent-as-Tool
Wrap an agent as a tool so that an orchestrator can delegate to sub-agents dynamically, based on LLM reasoning rather than fixed rules:
from sagewai import UniversalAgent, agent_as_tool
researcher = UniversalAgent(name="researcher", model="gpt-4o")
writer = UniversalAgent(name="writer", model="claude-sonnet-4-20250514")
orchestrator = UniversalAgent(
name="orchestrator",
model="gpt-4o",
tools=[
agent_as_tool(researcher, description="Researches a topic thoroughly"),
agent_as_tool(writer, description="Writes polished content"),
],
)
result = await orchestrator.chat("Research and write about quantum computing")
Here the orchestrator's LLM decides which sub-agents to call and in what order. This differs from SequentialAgent (fixed order) and ConditionalAgent (condition-based routing) — the LLM drives the delegation.
Choosing an Agent Pattern
| Pattern | Orchestration | Use Case |
|---|---|---|
Single UniversalAgent | LLM decides everything | Simple Q&A, single-domain tasks |
SequentialAgent | Fixed pipeline | Research -> Write -> Review |
ParallelAgent | Fan-out, merge | Multi-perspective analysis |
ConditionalAgent | Rule-based routing | Intent classification, triage |
LoopAgent | Iterative refinement | Edit until quality threshold |
agent_as_tool | LLM-decided delegation | Dynamic multi-agent orchestration |
These patterns compose. A SequentialAgent can include a ParallelAgent as one of its steps, which itself holds UniversalAgent sub-agents with different models and strategies.
What's Next
- Strategies — Control how agents reason: ReAct, Tree of Thoughts, LATS, Planning
- Memory & RAG — Add long-term memory with vector, graph, and hybrid retrieval
- Workflows — Durable execution with checkpointing, human approval, and distributed workers