Execution modes

Sagewai workflow steps run in one of five execution modes. Picking the right mode per step is what makes Sagewai both efficient and safe. Modes are per-step, not per-deployment. A single workflow run can plan inline on the worker, build a site in a fully isolated sandbox with a CLI agent, and summarise the result inline again — each step chooses the cheapest mode that still satisfies the isolation and capability it actually needs. This page defines the five modes, explains when to pick each one, and walks through a worked example end to end.

The five modes at a glance

Mode                     | Worker | Sandbox | Identity     | CLI agent(s) | Artifact dest | Best for
0 — Bare                 | ✓      | —       | —            | —            | —             | Pure orchestration: planning, summarising, simple Q&A, lightweight transforms
1 — Sandboxed            | ✓      | ✓       | —            | —            | —             | Untrusted tool execution with no per-customer creds (e.g. run user's Python snippet)
2 — Identity             | ✓      | ✓       | ✓            | —            | optional      | Tool execution with customer creds (read their S3, query their DB, call their API)
3 — Full                 | ✓      | ✓       | ✓            | ✓            | ✓             | Real user task: build a website, fix a bug in a repo, generate a report and push it
3b — Full + JIT callback | ✓      | ✓       | ✓ + callback | ✓            | ✓             | Like 3, but the CLI agent can request credentials it doesn't have at runtime

The mode is selected per workflow step, not once for the whole workflow. A workflow that "reads a brief, builds a site, summarises what it did" will typically be Mode 0 → Mode 3 → Mode 0.

Mode 0 — Bare

The simplest mode. The worker process executes the step directly; no isolation, no sandbox, no identity injection.

Topology:

Worker process
  └── Sagewai Agent runs the step inline
        ├── reads inputs from postgres
        ├── calls Tier-1 LLM if needed (worker env keys)
        └── writes output to postgres

Cost: essentially free (no container start, no network bridging, no env injection).

Security: the step has full access to the worker host process. Anything the worker can do, the step can do. Use only for code Sagewai itself trusts — never for code the user wrote or untrusted CLI invocations.

Examples:

  • Workflow step "summarise the previous step's output in plain English" — reads postgres, calls Tier-1 LLM, writes summary.
  • Step "decide which Mode 3 CLI to dispatch next based on input shape" — pure planning.
  • Step "validate the user's input against a JSON schema" — no IO, no LLM.

When NOT to use Mode 0:

  • The step calls user-provided code, scripts, or shell commands.
  • The step needs access to customer credentials.
  • The step calls external APIs that should be rate-limited or audited per customer.
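A Mode 0 step of the "validate against a schema" flavour can be sketched as plain inline code. This is a minimal sketch: `validate_brief` and `REQUIRED_KEYS` are illustrative names, not part of the Sagewai API.

```python
# Hypothetical Mode 0 step: pure validation, no sandbox, no identity.
# It runs inline in the worker process and touches nothing but its input,
# which is exactly what makes Mode 0 safe here.

REQUIRED_KEYS = {"style", "sections", "target_repo"}

def validate_brief(brief: dict) -> dict:
    """Mode 0 step: no IO, no LLM -- safe to run on the worker host."""
    missing = REQUIRED_KEYS - brief.keys()
    if missing:
        return {"ok": False, "missing": sorted(missing)}
    return {"ok": True, "missing": []}
```

The moment a step like this started shelling out or reading customer credentials, it would move to Mode 1 or Mode 2.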

Mode 1 — Sandboxed (no identity)

The step runs in a sandbox container with empty env. Useful for executing code Sagewai doesn't trust, but where no customer credentials are needed.

Topology:

Worker process
  └── Sagewai Agent
        ├── acquires sandbox (Docker/K8s/Lambda) with EMPTY env
        ├── dispatches tool execution via tool runner RPC
        └── writes output to postgres

Cost: sandbox start (~2-8s cold; ~50-100ms warm with the sandbox warm-pool).

Security: isolation is the trust boundary. Network policy applies (NONE / EGRESS_ONLY / FULL). Filesystem is container-local; nothing persists by default.

Examples:

  • Run a user-supplied Python snippet, return the result.
  • Execute an unverified shell command from a workflow input.
  • Run a code linter or formatter on user-provided code.
  • Generate a thumbnail with imagemagick — needs isolation in case of a bad input file, but no credentials.

Identity is empty: the sandbox has no Sealed env injected (Sealed is the Sagewai credential subsystem; see Security tiers). Tool runner reads only standard container env (PATH, HOME). Useful when you want isolation for safety, not for credential scoping.
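The "empty env" half of this contract can be illustrated with a plain subprocess stand-in. A real Sagewai sandbox is a container with a network policy; `run_untrusted_snippet` is a hypothetical helper that only demonstrates the env-scrubbing idea.

```python
import subprocess
import sys

def run_untrusted_snippet(code: str, timeout: int = 10) -> dict:
    """Mode 1 sketch: execute untrusted code with an EMPTY environment.
    -I puts Python in isolated mode (ignores PYTHON* vars and user site);
    env={} means nothing is inherited from the worker process."""
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],
        env={},
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return {"exit": proc.returncode, "stdout": proc.stdout, "stderr": proc.stderr}
```

A container adds what the subprocess cannot: filesystem isolation and the NONE / EGRESS_ONLY / FULL network policy.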

Mode 2 — Identity (no CLI agent)

Mode 1, plus a Sealed Identity injected into the sandbox env. Tools running inside can read customer credentials and behaviour knobs.

Topology:

Worker process
  └── Sagewai Agent
        ├── resolves Sealed cascade (system + workflow + user)
        │   re-resolves at sandbox-start time (drift detection)
        ├── acquires sandbox; backend injects env from cascade
        └── dispatches tool execution via tool runner RPC
              └── tools read os.environ for creds

Cost: Mode 1 cost + Sealed cascade resolution (~10-30ms for typical 1-3 level cascade against builtin backend; ~100ms+ for external backends like Vault).

Security: all of Mode 1 plus per-customer credential scoping. Audit trail captures every key injected (profile.injected), every cascade resolution (profile.cascade.resolved), every revocation interaction. No CLI agent → no LLM keys leave the sandbox unless a generic tool calls one.
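The cascade merge described above (system + workflow + user, most specific wins) can be sketched in a few lines. `resolve_cascade` is illustrative only; the real resolver also handles revocation and the drift detection mentioned in the topology.

```python
def resolve_cascade(*levels: dict) -> dict:
    """Sketch of a Sealed-style cascade merge: later (more specific)
    levels override earlier ones, key by key."""
    resolved: dict = {}
    for level in levels:
        resolved.update(level)
    return resolved

# system -> workflow -> user; the most specific value wins per key
env = resolve_cascade(
    {"CUSTOMER_DB_URL": "postgres://default", "LOG_LEVEL": "info"},   # system
    {"CUSTOMER_DB_URL": "postgres://customer-x"},                     # workflow
    {"LOG_LEVEL": "debug"},                                           # user
)
```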

Examples:

  • Step "query customer's PostgreSQL database, return summary" — needs CUSTOMER_DB_URL from Sealed.
  • Step "fetch from customer's S3 bucket and process" — needs AWS_ACCESS_KEY_ID/SECRET.
  • Step "call customer's internal API with their auth token" — needs CUSTOMER_API_TOKEN.

When to choose Mode 2 over Mode 3: when the work is deterministic and well-bounded. CLI agents (Mode 3) are for open-ended tasks where the LLM needs to make choices about what to do. Mode 2 is for "execute this specific operation with the right credentials."

Mode 3 — Full (CLI agent)

Mode 2, plus the tool runner spawns a CLI agent (Claude Code, Codex, Gemini, custom) inside the sandbox. The CLI does the actual work; the Sagewai Agent on the host orchestrates.

This is the mode the Quickstart demonstrates.

Topology:

Worker process
  └── Sagewai Agent (host)
        ├── decides which CLI to invoke + with what prompt
        ├── acquires sandbox with Identity (Tier-2 keys + artifact creds)
        ├── dispatches via tool runner RPC:
        │     "claude-code-cli run --prompt='build portfolio site'
        │      --workdir=/workspace"
        └── streams stdout/stderr back from sandbox

Sandbox
  ├── Identity env: ANTHROPIC_API_KEY, GITHUB_TOKEN, …
  ├── tool runner spawns CLI as subprocess
  └── CLI:
        ├── reads ANTHROPIC_API_KEY from os.environ
        ├── calls Anthropic API directly (network egress from sandbox)
        ├── edits files in /workspace
        └── on completion: pushes to artifact destination
              (git push using GITHUB_TOKEN, or aws s3 sync, or cp)

Cost: Mode 2 cost + CLI agent runtime (varies — Claude Code on a complex prompt can take 5-30 minutes). LLM API costs are charged to the CLI's Tier-2 key, not Sagewai's Tier-1.

Security: the CLI runs inside the sandbox boundary. Its LLM keys are sandbox-only. Its filesystem access is /workspace (or wherever the image variant configures). Network policy applies — typically EGRESS_ONLY to LLM provider + artifact destination.

Examples:

  • "Build a portfolio website from this brief" → Claude Code
  • "Refactor this Python module for performance" → Codex or Claude Code
  • "Generate a marketing landing page in Next.js" → Gemini CLI or Claude Code
  • "Run linters + tests + open a PR with fixes" → Codex

Image variants matter here. The sandbox image determines which CLI agents are available. The variant catalog (sagewai/sandbox-claude-code, sagewai/sandbox-multi, etc.) is operator-curated; workflows declare which variant they need.

Artifact destinations are first-class. Mode 3 implies output. Where the CLI's work goes is configured per-workflow:

  • github — git push to a target repo, using GITHUB_TOKEN from Identity
  • s3 — aws s3 sync /workspace s3://bucket/path, using AWS creds from Identity
  • local — cp /workspace /host-mounted/path, target path passed by worker
  • none — destination is the workflow output (/workspace is read by Sagewai Agent and persisted to postgres on step completion)
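The four destinations can be pictured as a small dispatch table. The command lines mirror the examples above, but `artifact_command` and its options are assumptions for illustration, not the real Sagewai config schema.

```python
# Hypothetical dispatch of the four artifact destinations listed above.
# Returns the argv the sandbox would run on step completion, or None when
# /workspace itself is the workflow output.

def artifact_command(dest: str, workspace: str = "/workspace", **opts):
    if dest == "github":
        # push the workspace to the target repo using GITHUB_TOKEN from Identity
        return ["git", "-C", workspace, "push", opts["remote"], opts.get("branch", "main")]
    if dest == "s3":
        return ["aws", "s3", "sync", workspace, opts["uri"]]
    if dest == "local":
        return ["cp", "-r", workspace, opts["target_path"]]
    if dest == "none":
        return None  # Sagewai Agent reads /workspace and persists to postgres
    raise ValueError(f"unknown artifact destination: {dest}")
```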

Mode 3b — Full with JIT credential callback

Mode 3b is for open-ended workflows where you cannot enumerate all credentials at enqueue time. The CLI agent or tool runner can request credentials it doesn't have at runtime, and the Sagewai Agent on host evaluates against policy (auto-approve, deny, HITL).

This is where the just-in-time callback channel comes into play. The tool runner RPC channel becomes bidirectional: in addition to host→sandbox dispatch, sandbox→host credential requests are honoured.

Topology (additional flow on top of Mode 3):

Sandbox (CLI agent or tool runner)
  └── needs credential X not in current env
        ↓ callback
        request_credential(name="ANOTHER_REPO_TOKEN",
                           scope="github:push",
                           reason="user said push to repo Y")
        ↓
Worker process: Sagewai Agent
  └── policy engine evaluates request:
        ├── auto-approve (matches a policy rule)
        ├── deny (forbidden by policy)
        └── HITL (requires operator approval via admin UI)
        ↓ if approved:
  └── Sealed cascade lookup or dynamic creation
        ↓ inject value into running sandbox env
  └── audit emit: credential.requested → approved → delivered

Cost: Mode 3 cost + per-callback round-trip (sandbox ↔ host RPC) + policy evaluation (~ms) or HITL approval delay (operator-bounded).

Security: every callback is audit-logged. The injection mechanism (e.g., docker exec env-set, k8s pod ENV update via projected secret reload, Lambda env update via reinvocation) limits blast radius — the new credential is in this sandbox only, not propagated to siblings in the pool.

Examples:

  • Claude Code is asked to push to a repo not pre-configured. Requests GITHUB_TOKEN_REPO_X. Operator policy: "auto-approve writes to repos under our org." → injected.
  • Codex needs to call a third-party API the operator hasn't pre-authorised. Requests THIRD_PARTY_API_KEY. Policy: "HITL required for new API keys." → admin UI surfaces the request; operator approves; injected.
  • A tool needs to escalate to a more powerful AWS role. Requests AWS_ROLE_ARN_X. Policy: "deny escalations to admin roles." → denied.

When to use 3b over 3: when the workflow is open-ended enough that you can't enumerate all credentials at enqueue time. For closed workflows ("build a site, push to repo X"), use Mode 3 with X's creds pre-set.
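The host-side policy evaluation can be sketched as a first-match-wins rule list with a HITL fallback, matching the three example outcomes above. The rule shape and the `evaluate` helper are hypothetical, not the Sagewai policy engine's actual API.

```python
import fnmatch

# Illustrative policy rules: first matching rule wins; anything unmatched
# falls through to human-in-the-loop approval via the admin UI.
RULES = [
    {"scope": "github:push", "name": "GITHUB_TOKEN_*", "decision": "auto-approve"},
    {"scope": "aws:assume-role", "name": "AWS_ROLE_ARN_*", "decision": "deny"},
]

def evaluate(name: str, scope: str, rules=RULES) -> str:
    """Return 'auto-approve', 'deny', or 'hitl' for a credential request."""
    for rule in rules:
        if rule["scope"] == scope and fnmatch.fnmatch(name, rule["name"]):
            return rule["decision"]
    return "hitl"
```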

Decision tree: which mode for which step?

Is the step pure orchestration (planning, summarising, validation)?
  └── YES → Mode 0
  └── NO ↓

Does the step need customer-specific credentials?
  ├── NO → Does the step run untrusted code or call shell/tools?
  │         ├── NO  → Mode 0 (just trusted Sagewai code)
  │         └── YES → Mode 1
  └── YES ↓

Does the step invoke a CLI agent (Claude Code, Codex, Gemini, …)?
  ├── NO  → Mode 2
  └── YES ↓

Are all credentials needed at start known at enqueue time?
  ├── YES → Mode 3
  └── NO  → Mode 3b
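The tree above transcribes directly into code; the boolean flag names are illustrative, not a Sagewai API.

```python
def mode_for_step(*, pure_orchestration: bool, needs_customer_creds: bool,
                  runs_untrusted_code: bool, uses_cli_agent: bool,
                  creds_known_at_enqueue: bool = True) -> str:
    """Straight transcription of the decision tree above."""
    if pure_orchestration:
        return "0"
    if not needs_customer_creds:
        # trusted Sagewai code stays inline; untrusted code gets a sandbox
        return "1" if runs_untrusted_code else "0"
    if not uses_cli_agent:
        return "2"
    return "3" if creds_known_at_enqueue else "3b"
```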

Cost, security, and capability trade-offs

       ┌────────────────────────────────────────────────────┐
       │                                                    │
       │  Mode 0 ────────────────────────────────► Mode 3b  │
       │                                                    │
       │  cheaper                              more capable │
       │  faster                            more isolated   │
       │  less secure                       more auditable  │
       │  fewer features                    more complex    │
       │                                                    │
       └────────────────────────────────────────────────────┘
Dimension                                     | Mode 0               | Mode 1                               | Mode 2                              | Mode 3                                       | Mode 3b
Step latency overhead                         | ~ms                  | container start (sec; <100ms pooled) | + cascade resolution (~tens of ms)  | + CLI startup (varies)                       | + callback latency
LLM cost paid by                              | Tier-1 (operator)    | n/a (no LLM by default)              | n/a or operator-side tools          | Tier-2 (customer)                            | Tier-2 (customer)
Customer credentials accessible               | no                   | no                                   | yes (env-injected)                  | yes (env-injected)                           | yes (env + JIT)
Network isolation                             | host network         | sandbox network policy               | sandbox network policy              | sandbox network policy                       | sandbox network policy
Filesystem isolation                          | host fs              | container fs                         | container fs + identity             | + /workspace + artifact dest                 | + JIT creds
Sealed audit coverage                         | n/a                  | n/a                                  | full                                | full                                         | full + callback events
Replay-determinism (the replay-safety contract) | trivial (pure code) | sandbox env = empty (trivial)        | replay original Identity values     | replay original Identity + CLI prompts/responses | replay original + cached callback results

Mixing modes within a workflow

Real workflows almost always mix modes. Each @workflow.step(...) decorator picks its own mode independently.

@workflow.step("plan")          # Mode 0 implicit
async def plan(brief: str) -> dict:
    """Pure orchestration step — runs inline on worker."""
    return await sagewai_agent.plan(brief)

@workflow.step(
    "build_site",
    sandbox_mode=SandboxMode.PER_RUN,
    security_profile_ref="customer-portfolio",
    cli_agent="claude-code",
)                                # Mode 3 explicit
async def build_site(plan: dict) -> str:
    """Mode 3 step — CLI agent, identity, artifact dest."""
    return await dispatch_cli(plan)

@workflow.step("summarise")      # Mode 0 implicit
async def summarise(artifact_path: str) -> str:
    """Mode 0 — orchestration."""
    return await sagewai_agent.summarise_changes(artifact_path)

The decorator API surface may evolve as ergonomics improve. The mode-per-step contract is stable.

Worked example: build a customer's portfolio site

A typical Mode 0 → Mode 3 → Mode 0 pipeline:

Step 1 — receive_brief                                       Mode 0
  Input: customer's natural-language brief
  Reads: workflow input
  Tier-1 LLM call: "extract structured requirements"
  Output: JSON {style, sections, target_repo, ...}
  Cost: ~500ms, Tier-1 LLM tokens (cheap planning model)

Step 2 — scaffold                                            Mode 3
  Input: requirements JSON, target repo URL
  Acquires: sandbox (Mode 3, image variant claude-code)
  Identity: profile "portfolio-customer-X" injected:
    ANTHROPIC_API_KEY (Claude Code uses)
    GITHUB_TOKEN (push artifact)
  Tool runner spawns Claude Code:
    claude-code run --prompt="scaffold Next.js site per JSON brief"
  Claude Code calls Anthropic, edits /workspace
  On completion: git push origin main → customer's repo
  Cost: ~5-15min, Tier-2 LLM tokens (Claude Sonnet)

Step 3 — verify                                              Mode 1
  Input: target repo URL (just-pushed commit SHA)
  Acquires: sandbox (Mode 1, image variant base — no creds needed)
  Tool runner runs:
    git clone <url> /tmp/check
    cd /tmp/check && npm ci && npm run build
  Returns: {build: "ok"} or {build: "failed", error: "..."}
  Cost: ~30-60s, no LLM
  No identity needed — the repo is public-readable for the build.

Step 4 — summarise                                           Mode 0
  Input: build result
  Tier-1 LLM call: "write a one-paragraph completion message"
  Output: human-friendly status to webhook / Slack
  Cost: ~500ms, Tier-1 LLM tokens

Total: one customer LLM bill (Step 2), zero customer credentials touched outside the sandbox, and four audit event types (profile.cascade.resolved, profile.injected, secret.decrypted × N, pool.sandbox.reset).

This is the workflow the Quickstart walks through end-to-end.

Anti-patterns

  1. One mode for the whole workflow. Setting sandbox_mode=PER_RUN at the workflow level forces every step into the sandbox, making the cheap orchestration steps (Mode 0) needlessly expensive.

  2. Mode 3 for a Mode 2 task. If you don't need the CLI agent's open-ended decision-making, don't pay for it. A scheduled "fetch S3, transform CSV, write to DB" job is Mode 2, not Mode 3.

  3. Mode 1 with credentials in workflow input. "Pass the API key as a step argument" defeats Sealed entirely. Credentials always flow through Identity, never through workflow inputs.

  4. Mode 3b without policy. Mode 3b's whole point is that the host-side policy decides what's allowed. If your "policy" is "auto-approve everything," you have Mode 3 with extra steps and an attack surface.

  5. Mode 0 for code Sagewai doesn't write. The worker host is the only thing trusted in Mode 0. User-provided shell commands, plugin code, scripts — all belong in Mode 1+.

Cross-references