Security tiers

Sagewai routes LLM keys through two separate surfaces. Tier 1 keys are the operator's own — used by the Sagewai Agent on the worker host to plan and dispatch work. Tier 2 keys belong to your customers — used inside the sandbox by the CLI agents and tools that do the user-facing work. The two tiers have different lifetimes, different sources, and different trust boundaries; mixing them defeats the role separation the rest of the platform depends on.

This page covers where each tier lives, how to configure it, how it's audited, and what you as the operator are responsible for versus what Sagewai handles.

Before you start

If you're brand-new to Sagewai, read the getting-started guide first; this page assumes a working install.

Tier 1 — Orchestration keys

The Sagewai Agent on each worker makes its own LLM calls (planning, tool selection, prompt routing). Tier 1 keys power those calls. They are:

  • Long-lived. Set once when the worker process starts.
  • Operator-supplied. They come from your infra: an env file, a Kubernetes Secret, a vault sidecar — whatever your platform team already uses.
  • Worker-scoped. They live in worker process memory and never leave it. The control plane never sees them; the sandbox never sees them.
  • Cheap-model friendly. The orchestration brain doesn't need a frontier model. A small/cheap/local model (Ollama Mistral, Claude Haiku, GPT-4o-mini) is usually the right pick.

A typical worker env:

ORCHESTRATION_OPENAI_KEY=sk-...
ORCHESTRATION_ANTHROPIC_KEY=sk-ant-...
ORCHESTRATION_OLLAMA_URL=http://localhost:11434

Tier 1 keys are your standard infra-secrets concern. Sagewai does not encrypt them, rotate them, or audit them — that's your existing infra's job.
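
As a minimal sketch of the worker side, assuming the variable names from the example env above, a worker could read its Tier 1 settings once at startup and hold them only in process memory. The `OrchestrationConfig` type and `load_orchestration_config` function are hypothetical names for illustration, not part of Sagewai's API.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True, repr=False)
class OrchestrationConfig:
    """Tier 1 settings held in worker process memory (hypothetical type)."""

    openai_key: Optional[str] = None
    anthropic_key: Optional[str] = None
    ollama_url: Optional[str] = None

    def configured_providers(self) -> List[str]:
        pairs = [
            ("openai", self.openai_key),
            ("anthropic", self.anthropic_key),
            ("ollama", self.ollama_url),
        ]
        return [name for name, value in pairs if value]

    def __repr__(self) -> str:
        # Show which providers are configured, never the key values.
        return f"OrchestrationConfig(providers={self.configured_providers()})"


def load_orchestration_config(
    env: Mapping[str, str] = os.environ,
) -> OrchestrationConfig:
    """Read Tier 1 settings from the worker env; fail if none are set."""
    config = OrchestrationConfig(
        openai_key=env.get("ORCHESTRATION_OPENAI_KEY") or None,
        anthropic_key=env.get("ORCHESTRATION_ANTHROPIC_KEY") or None,
        ollama_url=env.get("ORCHESTRATION_OLLAMA_URL") or None,
    )
    if not config.configured_providers():
        raise RuntimeError("no Tier 1 orchestration provider configured")
    return config
```

Overriding `__repr__` is a small guard against a key value ending up in a log line or traceback when the config object is printed.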

Tier 2 — Customer-task keys

Tier 2 keys are everything the CLI agents and tools inside the sandbox need to do the actual work: Anthropic / OpenAI / Gemini API keys for the CLI agents themselves, GitHub tokens for git push, AWS credentials for aws s3 sync, and so on. They are:

  • Short-lived. Injected when the sandbox container starts; scrubbed when it's released back to the pool.
  • Customer-scoped. Sourced from a Sealed Identity profile, resolved at enqueue time and re-resolved at sandbox-start to catch rotation drift.
  • Sandbox-only. Plaintext values exist only in the sandbox container's os.environ, only for the run's lifetime. They never touch the worker process or the control plane in cleartext.

A typical Identity profile mixes secret keys with non-secret behavior knobs:

ANTHROPIC_API_KEY=sk-ant-…   ← Claude Code uses this
OPENAI_API_KEY=sk-…          ← Codex uses this
GEMINI_API_KEY=…             ← Gemini CLI uses this
GITHUB_TOKEN=ghp_…           ← git push uses this
AWS_ACCESS_KEY_ID=…          ← aws s3 sync uses this
AWS_SECRET_ACCESS_KEY=…
DEBUG=1                      ← behavior knob, not a secret
MAX_TOKENS=8000              ← behavior knob

Tier 2 is governed end-to-end by Sagewai's Sealed Identity layer: profile management, vault-backed storage, revocation, redaction, per-key access control, and just-in-time human-in-the-loop approvals all apply to this tier and only this tier.
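
To illustrate the split between secret keys and behavior knobs, here is a small sketch that derives the key-name lists a run record can store without touching any value. The `ProfileEntry` type and `key_names_for_persistence` function are assumptions for illustration. The field names `effective_env_keys` and `effective_secret_keys` are the ones used in the end-to-end flow later on this page.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ProfileEntry:
    """One profile variable (hypothetical shape): a value plus a secret flag."""

    value: str
    secret: bool


def key_names_for_persistence(
    profile: Dict[str, ProfileEntry],
) -> Dict[str, List[str]]:
    """Return key NAMES only; values are deliberately left out."""
    return {
        "effective_env_keys": sorted(profile),
        "effective_secret_keys": sorted(
            name for name, entry in profile.items() if entry.secret
        ),
    }


profile = {
    "OPENAI_API_KEY": ProfileEntry(value="sk-example", secret=True),
    "DEBUG": ProfileEntry(value="1", secret=False),
}
names = key_names_for_persistence(profile)
```

With this profile, `names` holds both variables under `effective_env_keys` and only `OPENAI_API_KEY` under `effective_secret_keys`.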

Visual: who sees what

[Diagram: which components see Tier 1 keys and which see Tier 2 keys]

Operator vs end customer

Sagewai has two human roles in the credential model. The split is logical, not organisational — single-operator installs still benefit from the separation, because Tier 1 ("my Ollama URL") and Tier 2 ("the API keys for the customer project I'm building") have different lifetimes and risk profiles.

| Role | Owns | Configures | Audited via |
| --- | --- | --- | --- |
| Operator | the worker fleet, the control plane, Tier-1 keys | worker env, autopilot config, Sealed system-level config, image catalog | your own infra audit (Kubernetes audit logs, IAM trails, etc.) |
| End customer / project owner | per-project Identity profiles, artifact destinations | profile ref on workflows, identity content via admin UI / CLI / vault backend | Sagewai audit events: every reveal, every injection, every revocation |

How Sagewai protects Tier-2 credentials

Out of the box, Sagewai enforces these properties for every Tier-2 key:

  1. Plaintext never crosses the worker host process boundary. Sealed sets the env on the sandbox container at startup using each backend's native env-injection primitive (docker --env, Kubernetes pod env, Lambda function configuration). The plaintext value is never logged, never written to Postgres, never on the worker host's filesystem.

  2. Outputs are redacted before they leave the sandbox. The host-side RPC seam runs every stdout/stderr/error from the sandbox through a redaction filter built from the run's resolved secret values. If a tool accidentally echoes a key, it's stripped before any audit log or downstream consumer sees it.

  3. Every read is audited. Each decryption, injection, and cascade resolution emits a Sagewai audit event naming which key was used, on what run, by what actor — without naming the value.

  4. Revocation works mid-run. You can revoke a (profile, secret_key) pair: future enqueues fail closed; in-flight runs that already injected the value get aborted (hard-revoke) or expire on the next sandbox-start (soft-revoke).

  5. Cascade rotation is observable. When a profile is rotated between enqueue and sandbox-start, an audit event records the diff (added keys, removed keys) so you can trace which run picked up which version.

  6. Pool reuse is safe. When a sandbox is released to the warm-pool, Tier-2 env is scrubbed. If the scrub fails, the sandbox is discarded, not pooled.

  7. Fail-closed on registry unreachability. If Postgres is unreachable when Sagewai needs to consult the revocation registry, neither enqueue nor sandbox-start proceeds — they raise a registry-unavailable error rather than risk shipping a revoked key.
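
The fail-closed behavior in properties 4 and 7 can be sketched as a gate that runs before enqueue and again before sandbox-start. This is an illustration under assumptions, not Sagewai's implementation: the exception names, the `lookup` callable, and the use of `ConnectionError` to signal an unreachable registry are all hypothetical.

```python
from typing import Callable, Iterable, Set


class RegistryUnavailableError(RuntimeError):
    """Raised when the revocation registry cannot be consulted."""


class RevokedKeyError(RuntimeError):
    """Raised when a requested secret key has been revoked."""


# Hypothetical lookup: profile name -> set of revoked secret key names.
RevocationLookup = Callable[[str], Set[str]]


def check_not_revoked(
    profile: str, secret_keys: Iterable[str], lookup: RevocationLookup
) -> None:
    """Pass silently only if the registry answered and nothing is revoked."""
    try:
        revoked = lookup(profile)
    except ConnectionError as exc:
        # Fail closed: an unreachable registry is treated as "do not proceed".
        raise RegistryUnavailableError(
            f"revocation registry unreachable for profile {profile!r}"
        ) from exc
    blocked = sorted(set(secret_keys) & revoked)
    if blocked:
        raise RevokedKeyError(f"revoked keys for profile {profile!r}: {blocked}")
```

The point of the shape is that there is no code path where a lookup failure is swallowed and the run continues with an unchecked key.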

What you're responsible for

A few things sit outside what Sagewai can enforce and are your responsibility:

  1. Tier-1 protection. Tier-1 keys are operator infrastructure. Use your existing tooling (Kubernetes Secrets, AWS Secrets Manager, dotenv, vault sidecar) to manage them. Sagewai will read them from worker env and that's it.

  2. Backend-escape immunity. A vulnerability in your sandbox backend (Docker daemon, Kubernetes kubelet, Lambda runtime) that lets sandbox code reach the host is the backend vendor's problem. Sagewai layers defense-in-depth (network policies, resource limits, image variants without unnecessary tooling), but it can't defeat a backend escape.

  3. LLM provider trust. When a CLI agent calls Anthropic / OpenAI / etc., the provider sees the prompt content. The redaction layer scrubs known secret values from prompts before egress, but a brand-new secret type that hasn't been added to the redaction rules will pass through. Audit your prompts; configure redaction rules; don't paste secrets into prompts on purpose.

  4. Out-of-band exfiltration. A malicious CLI agent inside a sandbox with NetworkPolicy.FULL can call any URL it likes. The deployment policy — which sandbox image variant gets which network policy — is yours to set. Don't put untrusted CLI agents in FULL networks.
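
For the deployment-policy point in item 4, a pre-deploy check along these lines can catch an untrusted image that was given full network access. Everything here is a sketch: the `NetworkPolicy` enum is a local stand-in (only `FULL` is named on this page, `NONE` is a hypothetical member), and the image names and the `validate_deployment_policy` function are invented for illustration.

```python
from enum import Enum
from typing import Dict, Set


class NetworkPolicy(Enum):
    """Local stand-in for the platform's network policy setting."""

    NONE = "none"  # hypothetical member: no outbound network
    FULL = "full"  # named on this page: sandbox can reach any URL


def validate_deployment_policy(
    policy_by_image: Dict[str, NetworkPolicy], trusted_images: Set[str]
) -> None:
    """Reject any mapping that gives an untrusted image FULL network access."""
    offenders = sorted(
        image
        for image, policy in policy_by_image.items()
        if policy is NetworkPolicy.FULL and image not in trusted_images
    )
    if offenders:
        raise ValueError(f"untrusted images with FULL network: {offenders}")


# Example mapping with hypothetical image variant names.
validate_deployment_policy(
    {"agent-vetted": NetworkPolicy.FULL, "agent-experimental": NetworkPolicy.NONE},
    trusted_images={"agent-vetted"},
)
```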

How a Tier-2 secret flows end-to-end

Walking through a single Tier-2 key (OPENAI_API_KEY in profile acme-prod) at a full-CLI-agent step:

1. You create the profile via admin UI or CLI:
       admin UI → POST /api/v1/admin/sealed/profiles
       Stored in the configured Identity backend (builtin file
       store, or HashiCorp Vault, 1Password, AWS Secrets Manager,
       SOPS, Bitwarden — see Sandbox backends).
       Encrypted at rest with a master key (Fernet wrapping for
       the builtin backend; backend-native encryption otherwise).
       Audit event: profile created.
       ↓
2. You reference the profile from a workflow:
       wf.enqueue(security_profile_ref="acme-prod")
       OR set at the workflow level:
       admin-state.workflows[wf_name].security_profile_ref = "acme-prod"
       ↓
3. At enqueue, Sagewai resolves the cascade:
       Combines system-level + workflow-level + user-level profile
       refs and overrides into one effective profile.
       Audit event: cascade resolved.
       ↓
4. The workflow_runs row is persisted with key NAMES only:
       effective_env_keys = ['DEBUG', 'OPENAI_API_KEY']
       effective_secret_keys = ['OPENAI_API_KEY']
       security_profile_ref = 'acme-prod'
       Plaintext values are NEVER persisted at this layer.
       ↓
5. A worker claims the run and dispatches by mode:
       Sagewai re-resolves the cascade (catches rotation drift),
       checks the revocation registry (fails closed if unreachable),
       and produces the env dict that the sandbox backend will use.
       Audit events: profile injected; key decrypted (per key).
       ↓
6. The sandbox backend sets env on the container:
       Docker:  --env OPENAI_API_KEY=sk-…
       K8s:     pod.spec.env or projected secret
       Lambda:  function configuration env
       Plaintext value lives ONLY here, only for the run's lifetime.
       ↓
   Per-key access control narrows further: when a tool call
   spawns a specific CLI agent (claude-code, codex, …), only the
   secret keys the ACL allows for that tool are passed through
   to the subprocess. Non-secret behavior knobs aren't filtered.
       ↓
7. CLI agents inside the sandbox read the keys:
       claude-code → reads os.environ["ANTHROPIC_API_KEY"]
       openai-codex → reads os.environ["OPENAI_API_KEY"]
       Calls the LLM inference point.
       ↓
8. Run completes:
       The cleanup hook scrubs Tier-2 env in the container.
       (If cleanup fails, the container is discarded, not pooled.)
       Audit event: sandbox reset.
       ↓
9. workflow_runs.status = 'completed' (or 'failed').
   Sandbox returned to pool (cleanup ok) or destroyed (cleanup failed).
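
The per-key access control step between steps 6 and 7 can be sketched as a filter over the sandbox env before a CLI agent subprocess is spawned. This is a minimal illustration, assuming the ACL is a mapping from tool name to allowed secret key names. The `TOOL_ACL` table and the `env_for_tool` function are hypothetical.

```python
from typing import Dict, Mapping, Set

# Hypothetical ACL: which secret keys each tool may receive.
TOOL_ACL: Dict[str, Set[str]] = {
    "claude-code": {"ANTHROPIC_API_KEY"},
    "codex": {"OPENAI_API_KEY"},
}


def env_for_tool(
    tool: str, sandbox_env: Mapping[str, str], secret_keys: Set[str]
) -> Dict[str, str]:
    """Build the subprocess env: drop secrets the tool's ACL does not allow.

    Non-secret behavior knobs pass through unfiltered.
    """
    allowed = TOOL_ACL.get(tool, set())  # unknown tools get no secrets
    return {
        name: value
        for name, value in sandbox_env.items()
        if name not in secret_keys or name in allowed
    }


sandbox_env = {
    "ANTHROPIC_API_KEY": "sk-ant-example",
    "OPENAI_API_KEY": "sk-example",
    "DEBUG": "1",
}
secret_keys = {"ANTHROPIC_API_KEY", "OPENAI_API_KEY"}
child_env = env_for_tool("claude-code", sandbox_env, secret_keys)
# child_env could then be passed as subprocess.run(..., env=child_env).
```

Defaulting an unknown tool to an empty allow-list keeps the filter deny-by-default, which matches the fail-closed posture elsewhere on this page.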

Common pitfalls

  1. Putting Tier-2 keys in worker env. Worker env is Tier-1 only. If OPENAI_API_KEY belongs to a customer's tool, it goes in their Identity profile, never in the worker process.

  2. Reading Tier-1 keys from inside the sandbox. The sandbox shouldn't need orchestration keys. If a CLI agent inside the sandbox is making orchestration-style decisions, the workflow shape is wrong — orchestration belongs to the Sagewai Agent on the host.

  3. Logging plaintext secret values. logger.info(f"calling api with {api_key}") is forbidden anywhere in your tools. Audit events log key NAMES; redaction strips values from prompts and outputs.

  4. Persisting plaintext in your own tables. If you extend the schema, follow the platform pattern: store key names, not values.

  5. Trusting the LLM's "I'll keep it secret". Don't ask a model to "be careful with this API key" and then put the key in the prompt. Either redact (configure a redaction rule) or don't include it.

  6. Using one profile for both operator and customer keys. Tier 1 is your infra config. Tier 2 is the customer's identity. Mixing them defeats the role separation.
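
For pitfalls 3 and 5, value-based redaction is simple to reason about. The sketch below builds a filter from a run's resolved secret values and applies it to any text before it is logged or sent onward. It is an illustration of the idea, not Sagewai's redaction layer. `build_redactor` and the `[REDACTED]` mask are assumptions.

```python
from typing import Callable, Iterable


def build_redactor(
    secret_values: Iterable[str], mask: str = "[REDACTED]"
) -> Callable[[str], str]:
    """Return a function that replaces every known secret value with a mask."""
    # Longest first, so a secret that contains a shorter one is masked whole.
    values = sorted({v for v in secret_values if v}, key=len, reverse=True)

    def redact(text: str) -> str:
        for value in values:
            text = text.replace(value, mask)
        return text

    return redact


redact = build_redactor(["sk-abc123", "ghp_xyz789"])
safe_line = redact("calling api with sk-abc123")
```

A filter like this only catches values it was built from, which is why a secret that never entered the profile passes through untouched.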

Trust assumption summary

| Surface | Trusted with Tier-1? | Trusted with Tier-2? |
| --- | --- | --- |
| Operator's infra config (env files, Kubernetes Secrets) | ✓ | n/a (Tier-1 only) |
| Worker process memory | ✓ | ✗ |
| Postgres workflow_runs columns | ✗ | ✗ (key names only) |
| Postgres revocation table | ✗ | ✗ (names only) |
| Postgres audit-event details | ✗ | ✗ (names only; values forbidden) |
| Builtin Identity profile file (~/.sagewai/profiles.json) | | ✓ (encrypted at rest, Fernet) |
| External Identity backend (Vault, 1Password, …) | | ✓ (per the backend's security model) |
| Sandbox os.environ (in-memory, container-scoped) | ✗ | ✓ |
| Container filesystem (/workspace, /tmp, …) | ✗ | ✗ unless the tool runner explicitly writes (usually it does not) |
| LLM inference point (external) | | ✗ (prompts must be redacted) |
| Artifact destination (GitHub repo, S3, …) | | ✗ (CLI agent uses creds locally; never embeds in artifact content) |

The "✓ for Tier-2" rows are the trust boundary. Everything else must treat Tier-2 as forbidden plaintext.
