Production patterns
These are the patterns you reach for when you've outgrown the SDK basics but don't yet need a full end-to-end tutorial. Two kinds of content live here: reference examples (200–400 lines, each solving one non-trivial problem) and orchestration recipes (the topologies that appear in almost every production system). Read what matches your current problem.
Reference examples
SDK patterns
| # | Example | Problem solved |
|---|---|---|
| 19 | domain_model | Domain-modelled agent state |
| 21 | full_stack | End-to-end SDK + workflow + fleet |
| 29 | memory_strategies | Strategy-based memory extraction |
| 31 | grounded_multi_model | Multi-LLM grounded retrieval |
| 32 | global_shared_memory | Cross-agent shared knowledge |
Autopilot patterns
| # | Example | Problem solved |
|---|---|---|
| 35 | autopilot_hosted_service | Autopilot via the hosted blueprint service |
| 28 | autopilot_quickstart | Autopilot without an LLM key |
Fleet patterns
| # | Example | Problem solved |
|---|---|---|
| 20 | fleet_workers | Distributed worker registration |
| 26 | fleet_scoped_dispatch | Capability-based dispatch, per-project routing |
Observatory patterns
| # | Example | Problem solved |
|---|---|---|
| 12 | budget_enforcement | Per-user/team/project budget caps |
Training Loop patterns
| # | Example | Problem solved |
|---|---|---|
| 25 | training_data_pipeline | Curator-style data capture |
Sealed patterns
| # | Example | Problem solved |
|---|---|---|
| 16 | agent_governance | Approval flows + audit trail |
Companion notes
- Example 22: memory_holds_across_llms covers the memory + LLM-swap pattern combination.
- Example 17: unsloth_finetune (deprecated), superseded by Example 38. Left in place for backward compatibility.
Orchestration recipes
These topologies solve distinct structural problems. Pick the one that matches how your task is shaped. Patterns compose — a pipeline of memory-augmented agents with an approval gate in the middle is a valid and common production architecture.
ReAct Loop
The default execution mode for all Sagewai agents. The agent alternates between reasoning (deciding what to do) and acting (calling tools), looping until it produces a final answer.
When to use: any task that requires tool use, multi-step reasoning, or retrieval.
from sagewai import UniversalAgent, tool

@tool
async def web_search(query: str) -> str:
    """Search the web."""
    return f"Results for: {query}"

agent = UniversalAgent(
    name="researcher",
    model="gpt-4o",
    tools=[web_search],
)

result = await agent.chat("What happened in AI this week?")
Control the maximum iterations with AgentConfig(max_iterations=10).
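To make the loop and its iteration cap concrete, here is a minimal, library-free sketch. It is not Sagewai's implementation: `react_loop`, `scripted_policy`, and the `TOOLS` registry are hypothetical names, and a scripted policy stands in for the LLM.

```python
from typing import Callable

# Hypothetical tool registry (tool name -> callable); not part of Sagewai.
TOOLS: dict[str, Callable[[str], str]] = {
    "web_search": lambda query: f"Results for: {query}",
}


def scripted_policy(question: str, observations: list[str]) -> dict:
    """Stand-in for the LLM: search once, then answer from the observation."""
    if not observations:
        return {"action": "web_search", "input": question}
    return {"final": f"Answer based on: {observations[-1]}"}


def react_loop(question: str, policy, max_iterations: int = 10) -> str:
    """Alternate reasoning and acting until a final answer or the cap is hit."""
    observations: list[str] = []
    for _ in range(max_iterations):
        decision = policy(question, observations)      # reason
        if "final" in decision:
            return decision["final"]
        tool = TOOLS[decision["action"]]               # act
        observations.append(tool(decision["input"]))
    raise RuntimeError(f"No final answer after {max_iterations} iterations")


answer = react_loop("What happened in AI this week?", scripted_policy)
```

The cap matters because a policy that never emits a final answer would otherwise loop, and spend tokens, indefinitely.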
Supervisor / Subordinate
A supervisor agent breaks work into subtasks and delegates to worker agents. The supervisor synthesises results into a final response.
When to use: complex tasks with clearly separable subtasks (research + write + edit), or when different tasks need different models.
from sagewai import UniversalAgent, agent_as_tool

researcher = UniversalAgent(name="researcher", model="gpt-4o-mini", tools=[web_search])
writer = UniversalAgent(name="writer", model="claude-3-5-sonnet-20241022")

supervisor = UniversalAgent(
    name="supervisor",
    model="gpt-4o",
    tools=[
        agent_as_tool(researcher, description="Search and gather facts"),
        agent_as_tool(writer, description="Write polished prose from notes"),
    ],
    system_prompt="Break the task into research and writing, then combine the results.",
)

result = await supervisor.chat("Write a report on renewable energy trends")
Fan-out / Fan-in
Run multiple agents in parallel, then merge their outputs. When subtasks are independent, the fan-out stage takes roughly as long as its slowest branch rather than the sum of all branches.
When to use: competitive analysis, multi-perspective review, generating multiple options simultaneously.
import asyncio
from sagewai import UniversalAgent

critic_a = UniversalAgent(name="critic-a", model="gpt-4o", system_prompt="Analyze from a risk perspective.")
critic_b = UniversalAgent(name="critic-b", model="gpt-4o", system_prompt="Analyze from an opportunity perspective.")
critic_c = UniversalAgent(name="critic-c", model="claude-3-5-sonnet-20241022", system_prompt="Analyze from a technical perspective.")

proposal = "Launch a new AI-powered customer service platform."

# Fan-out: run all three in parallel
results = await asyncio.gather(
    critic_a.chat(proposal),
    critic_b.chat(proposal),
    critic_c.chat(proposal),
)

# Fan-in: synthesizer merges
synthesizer = UniversalAgent(name="synthesizer", model="gpt-4o")
merged = await synthesizer.chat(
    f"Synthesize these three analyses into an executive summary:\n\n"
    + "\n\n---\n\n".join(results)
)
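One practical concern with fan-out is a branch that fails. By default `asyncio.gather` propagates the first exception to the caller; passing `return_exceptions=True` returns exceptions in place of results so the fan-in step can proceed with the branches that succeeded. The sketch below uses plain coroutines in place of agents; `analyze` and `fan_out` are hypothetical names, and the failing branch is simulated.

```python
import asyncio


async def analyze(perspective: str, proposal: str) -> str:
    """Stand-in for an agent call; the 'technical' branch fails on purpose."""
    await asyncio.sleep(0.01)
    if perspective == "technical":
        raise TimeoutError("model call timed out")
    return f"[{perspective}] {proposal}"


async def fan_out(proposal: str) -> tuple[list[str], list[BaseException]]:
    """Run all branches concurrently and separate results from failures."""
    outcomes = await asyncio.gather(
        analyze("risk", proposal),
        analyze("opportunity", proposal),
        analyze("technical", proposal),
        return_exceptions=True,  # exceptions are returned, not raised
    )
    successes = [o for o in outcomes if not isinstance(o, BaseException)]
    failures = [o for o in outcomes if isinstance(o, BaseException)]
    return successes, failures


successes, failures = asyncio.run(fan_out("Launch a support platform."))
```

Whether to synthesize from partial results or abort is a product decision; the point is that the choice becomes explicit instead of being made by the first exception.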
Pipeline
A sequential chain where each agent's output feeds directly into the next. Use SequentialAgent for clean composition.
When to use: article generation (research → draft → edit), data processing (extract → transform → summarise), multi-stage validation.
from sagewai import UniversalAgent, SequentialAgent

researcher = UniversalAgent(name="researcher", model="gpt-4o-mini", tools=[web_search])
writer = UniversalAgent(name="writer", model="claude-3-5-sonnet-20241022")
editor = UniversalAgent(name="editor", model="gpt-4o")

pipeline = SequentialAgent(
    name="content-pipeline",
    agents=[researcher, writer, editor],
)

final = await pipeline.chat("Write a blog post about the future of remote work")
Data flows left to right: the researcher's output becomes the writer's input, the writer's output becomes the editor's input, and the editor's output is the final result.
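The data flow described above amounts to folding a value through a list of async stages. The following is a minimal sketch of that idea in plain Python, not a description of how SequentialAgent is implemented; `run_pipeline` and the three stage functions are hypothetical stand-ins for agents.

```python
import asyncio
from typing import Awaitable, Callable

Stage = Callable[[str], Awaitable[str]]


async def research(topic: str) -> str:
    return f"notes({topic})"


async def write(notes: str) -> str:
    return f"draft({notes})"


async def edit(draft: str) -> str:
    return f"final({draft})"


async def run_pipeline(stages: list[Stage], initial: str) -> str:
    """Feed each stage's output into the next, in order."""
    value = initial
    for stage in stages:
        value = await stage(value)
    return value


output = asyncio.run(run_pipeline([research, write, edit], "remote work"))
```

Because each stage awaits the previous one, total latency is the sum of the stage latencies, which is the trade-off against fan-out.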
Approval Gate
Pause a workflow until a human reviews and approves. The workflow checkpoints its state; a human (via admin panel, CLI, or API) resumes it.
When to use: sensitive operations (financial transactions, production deployments, content publication), compliance workflows, PII handling.
from sagewai import ApprovalGate, DurableWorkflow
from sagewai.core.state import InMemoryStore, WorkflowWaiting

store = InMemoryStore()
workflow = DurableWorkflow(name="publish-pipeline", store=store)
gate = ApprovalGate(workflow=workflow)

@workflow.step("draft")
async def draft(topic: str) -> str:
    # writer is the agent defined in the Supervisor / Subordinate recipe
    return await writer.chat(f"Write about: {topic}")

@workflow.step("approve")
async def approve(content: str) -> str:
    await gate.request_approval(prompt=f"Approve: {content[:80]}...")
    return content  # continues after approval

@workflow.step("publish")
async def publish(content: str) -> str:
    return f"[PUBLISHED] {content}"

run_id = "run-001"
try:
    await workflow.run(run_id=run_id, topic="AI safety")
except WorkflowWaiting:
    print("Waiting for human approval...")

# Approve from the admin panel or CLI, or call gate.approve(run_id) programmatically
await gate.approve(run_id=run_id)

# Run again with the same run_id to resume the paused workflow
result = await workflow.run(run_id=run_id, topic="AI safety")
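The checkpoint-and-resume behaviour is the part of this recipe that is easiest to get wrong, so here is a library-free sketch of the idea. Everything in it is hypothetical (`TinyWorkflow`, `Waiting`, the `calls` log); it assumes step outputs are cached per run ID so that a resumed run skips completed steps, and it does not claim to mirror DurableWorkflow's internals.

```python
class Waiting(Exception):
    """Raised when a run pauses for human approval (hypothetical)."""


class TinyWorkflow:
    def __init__(self) -> None:
        self.checkpoints: dict[str, dict[str, str]] = {}  # run_id -> step -> output
        self.approved: set[str] = set()
        self.calls: list[str] = []  # which steps actually executed

    def _step(self, run_id: str, name: str, fn, arg: str) -> str:
        done = self.checkpoints.setdefault(run_id, {})
        if name not in done:          # execute only if not checkpointed
            self.calls.append(name)
            done[name] = fn(arg)
        return done[name]

    def approve(self, run_id: str) -> None:
        self.approved.add(run_id)

    def run(self, run_id: str, topic: str) -> str:
        content = self._step(run_id, "draft", lambda t: f"Draft about {t}", topic)
        if run_id not in self.approved:
            raise Waiting(run_id)     # pause; the draft is already checkpointed
        return self._step(run_id, "publish", lambda c: f"[PUBLISHED] {c}", content)


wf = TinyWorkflow()
try:
    wf.run("run-001", "AI safety")
    paused = False
except Waiting:
    paused = True

wf.approve("run-001")
result = wf.run("run-001", "AI safety")  # resumes; "draft" is not re-executed
```

The `calls` log ends up as `["draft", "publish"]`: the second `run` reuses the checkpointed draft, so the content a human approved is exactly the content that gets published.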
Self-Correcting Agent
The agent retries on failure with a reflection step: it examines its error and adjusts its approach before the next attempt.
When to use: unreliable tool calls (network errors, rate limits), tasks that require iterative refinement, code generation with execution feedback.
from sagewai import UniversalAgent
from sagewai.core.self_correction import SelfCorrectionStrategy

# run_code is assumed to be a @tool-decorated function defined elsewhere (not shown)
agent = UniversalAgent(
    name="coder",
    model="gpt-4o",
    strategy=SelfCorrectionStrategy(max_corrections=3),
    tools=[run_code],
)

result = await agent.chat("Write a Python function to parse ISO 8601 dates")
# If the generated code raises an exception, the agent reads the error,
# reflects on what went wrong, and regenerates, up to 3 times.
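The retry-with-reflection loop can be sketched without the SDK. In the example below a scripted `generate` function stands in for the LLM, and the error text from each failed attempt is fed into the next one. All names are hypothetical, and the reading of `max_corrections` as "one initial attempt plus up to N retries" is an assumption, not documented SelfCorrectionStrategy behaviour.

```python
def generate(task: str, feedback: list[str]) -> str:
    """Stand-in for the LLM: the first draft has a bug, the retry fixes it."""
    if not feedback:
        return "def year(s):\n    return int(s.split('-')[3])"  # raises IndexError
    return "def year(s):\n    return int(s.split('-')[0])"


def run_code(source: str) -> int:
    """Execute generated code. Never exec untrusted output outside a sandbox."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["year"]("2024-05-17")


def self_correct(task: str, max_corrections: int = 3) -> tuple[int, int]:
    """Return (result, corrections_used): one initial try plus retries."""
    feedback: list[str] = []
    for corrections in range(max_corrections + 1):
        source = generate(task, feedback)
        try:
            return run_code(source), corrections
        except Exception as exc:
            # Reflection input: the error text shapes the next attempt.
            feedback.append(f"{type(exc).__name__}: {exc}")
    raise RuntimeError(
        f"Still failing after {max_corrections} corrections: {feedback[-1]}"
    )


value, corrections_used = self_correct("Extract the year from an ISO 8601 date")
```

Bounding the retries keeps a task the model cannot solve from turning into an unbounded spend.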
Memory-Augmented Agent
An agent wired to a ContextEngine that retrieves relevant knowledge before each response and stores new facts after each conversation.
When to use: agents that need long-term memory, domain knowledge, or must learn from past interactions.
from sagewai import UniversalAgent
from sagewai.context import ContextEngine, ContextScope, InMemoryMetadataStore, InMemoryVectorStore
from sagewai.intelligence import HashEmbedder

engine = ContextEngine(
    metadata_store=InMemoryMetadataStore(),
    vector_store=InMemoryVectorStore(),
    embedder=HashEmbedder(dimension=384),
    project_id="my-project",
)

# Pre-load domain knowledge
await engine.ingest_text(
    text="Our API rate limit is 1000 requests/minute per project.",
    title="API Limits",
    scope=ContextScope.PROJECT,
    scope_id="my-project",
)

# Agent retrieves context automatically on each chat call
agent = UniversalAgent(
    name="support-agent",
    model="gpt-4o",
    memory=engine,  # wires ContextEngine as memory provider
)

result = await agent.chat("What are the API rate limits?")
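For intuition about the retrieve-before-respond step, the sketch below ranks stored texts against a question and prepends the best match to the prompt. It is a toy under stated assumptions: a word-count vector with cosine similarity stands in for a real embedder and vector store, and `embed`, `retrieve`, and `build_prompt` are hypothetical names rather than ContextEngine methods.

```python
import math
from collections import Counter

DOCUMENTS = [
    "Our API rate limit is 1000 requests/minute per project.",
    "Support hours: 9am to 5pm on weekdays.",
]


def embed(text: str) -> Counter:
    """Toy sparse embedding: lowercase word counts, punctuation stripped."""
    words = (w.strip(".,?!:") for w in text.lower().split())
    return Counter(w for w in words if w)


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]


def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k stored texts most similar to the query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question: str) -> str:
    """Prepend retrieved context so the model answers from stored knowledge."""
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("What are the API rate limits?")
```

Here the question shares the words "api" and "rate" with the first document and nothing with the second, so the rate-limit text is the one placed in the prompt.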
Enable auto-learn to extract and store facts from every conversation:
from sagewai.context import MemoryBridge
from sagewai.intelligence import RuleBasedFactExtractor
bridge = MemoryBridge(context_engine=engine, fact_extractor=RuleBasedFactExtractor())
# After each conversation, call bridge.extract_from_conversation(messages, ...)
# to persist learned facts for future sessions.
Choosing a pattern
| Pattern | Latency | Complexity | Best For |
|---|---|---|---|
| ReAct Loop | Medium | Low | Most tasks |
| Supervisor | High | Medium | Heterogeneous subtasks |
| Fan-out / Fan-in | Low (parallel) | Medium | Independent analyses |
| Pipeline | Medium | Low | Linear multi-stage processing |
| Approval Gate | Variable | Low | Compliance, sensitive ops |
| Self-Correcting | High | Low | Unreliable tools, code gen |
| Memory-Augmented | Medium | Medium | Long-running, domain-specific |
See also
- Tutorials — end-to-end walkthroughs that compose these patterns into complete production scenarios.
- Learn the SDK — shorter single-capability examples that precede these patterns.
- Platform — capability deep-dives.
- Reference — examples — the full numbered example list.