Self-hosted deployment

Run the full Sagewai stack on your own infrastructure. This guide covers local bring-up, environment configuration, database management, reverse proxy setup, fleet workers, and common failure modes.


Prerequisites

Before you start, install:

  • Docker and Docker Compose v2+
  • PostgreSQL 14+ (or use the provided container)
  • 4 GB RAM minimum for a local deployment; 8 GB or more for production
  • A domain name and TLS certificate for production deployments
  • At least one LLM API key (OpenAI, Anthropic, or Google)

Quick start

git clone https://github.com/sagewai/platform.git
cd platform
cp .env.example .env
# Fill in your API keys in .env
make start          # infra + DB migrations + admin backend
make web APP=admin  # admin frontend (run in a second terminal)

make start brings up PostgreSQL, Redis, Milvus, NebulaGraph, and the observability stack, applies database migrations, and starts the admin backend. The frontend runs separately.

Once up:

  • Admin API at http://localhost:8000
  • Admin UI at http://localhost:3008 (after make web)
  • Grafana at http://localhost:3200

Architecture overview

+---------------------------------------------------------+
|                Reverse Proxy (nginx / caddy)             |
|                TLS termination                           |
+-------------+--------------+----------------------------+
|  Admin UI   |  Admin API   |  Fleet Gateway             |
|  :3008      |  :8000       |  :8000/api/v1/fleet        |
+-------------+--------------+----------------------------+
|                                                         |
|   PostgreSQL :5432    Redis :6379                        |
|   Milvus :19530       NebulaGraph :9669                 |
|                                                         |
+---------------------------------------------------------+
ServicePurposePort
PostgreSQL 14+State, workflows, fleet, audit5432
Redis 7Cache, sessions6379
Milvus 2.3Vector embeddings for RAG19530
NebulaGraph 3.6Knowledge graph, relations9669
GrafanaDashboards, alerting3200
PrometheusMetrics scraping9090
OTel CollectorDistributed tracing4317
LocalStackS3-compatible archive storage4566

Docker Compose profiles

Sagewai ships two compose files.

Full infrastructure (docker-compose.yml)

Starts all infrastructure and observability services:

make infra       # Postgres, Redis, Milvus, NebulaGraph, Grafana, Prometheus, OTel
make infra-core  # Postgres + Redis only

Development stack (docker-compose.dev.yml)

Extends the infrastructure compose with dockerized backends (admin, nexus, chronicles, haus) and a db-migrate init container:

make start   # Infrastructure + migrations + admin backend + frontend
make down    # Stop everything

Production overrides

Create a docker-compose.prod.yml overlay for production resource limits:

# docker-compose.prod.yml
services:
  postgres:
    deploy:
      resources:
        limits:
          memory: 2G
    restart: always
    logging:
      driver: json-file
      options:
        max-size: "50m"
        max-file: "5"

  redis:
    restart: always
    deploy:
      resources:
        limits:
          memory: 512M

  milvus:
    restart: always
    deploy:
      resources:
        limits:
          memory: 4G

Start with both files:

docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

Environment variables

Copy .env.example and fill in the values you need.

Required

VariableDescriptionDefault
DATABASE_URLPostgreSQL connection stringpostgresql://sagewai:sagewai_password@localhost:5432/sagewai
OPENAI_API_KEYOpenAI API key (or another provider key)
JWT_SECRETSecret for JWT token signing

Infrastructure

VariableDescriptionDefault
REDIS_URLRedis connection URLredis://localhost:6379
MILVUS_HOSTMilvus vector DB hostlocalhost
MILVUS_PORTMilvus vector DB port19530
MILVUS_URIMilvus URI (for Zilliz Cloud)
MILVUS_TOKENMilvus auth token (for Zilliz Cloud)
NEBULA_HOSTNebulaGraph hostlocalhost
NEBULA_PORTNebulaGraph port9669
NEBULA_USERNebulaGraph usernameroot
NEBULA_PASSWORDNebulaGraph passwordnebula

AI providers

VariableDescriptionRequired
OPENAI_API_KEYOpenAI API keyAt least one
ANTHROPIC_API_KEYAnthropic API keyOptional
GOOGLE_API_KEYGoogle AI API keyOptional
LITELLM_PROXY_URLLiteLLM proxy URLOptional
LITELLM_API_KEYLiteLLM master keyOptional

Security

VariableDescriptionNotes
JWT_SECRETJWT signing secretRequired for auth
JWT_ALGORITHMJWT algorithmDefault: HS256
SAGEWAI_ENCRYPTION_KEYFernet key for encrypting stored secretsRecommended

Observability

VariableDescriptionDefault
LOG_LEVELLogging levelINFO
LOG_FORMATLog format (json or text)json
OTEL_EXPORTER_OTLP_ENDPOINTOpenTelemetry collector endpointhttp://localhost:4317

Database setup

Initial setup

Apply migrations after starting PostgreSQL:

make db-upgrade   # Apply all Alembic migrations
make db-seed      # Seed demo data (optional)

Or do both together:

make db-fresh     # Drop schema + migrate + seed

Backup and restore

# Backup
pg_dump -h localhost -U sagewai sagewai > backup.sql

# Restore
psql -h localhost -U sagewai sagewai < backup.sql

Migration management

make db-upgrade   # Apply pending migrations
make db-reset     # Drop and recreate schema (destructive)
make db-fresh     # Reset + migrate + seed

Caution: db-reset drops the entire schema. Back up first on any system that holds real data.


Reverse proxy configuration

Place a reverse proxy in front of Sagewai in production to terminate TLS.

Nginx

server {
    listen 443 ssl http2;
    server_name sagewai.example.com;

    ssl_certificate     /etc/nginx/certs/cert.pem;
    ssl_certificate_key /etc/nginx/certs/key.pem;

    # Admin frontend
    location / {
        proxy_pass http://localhost:3008;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Admin API + Fleet gateway
    location /api/ {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # WebSocket support (AG-UI, live updates)
    location /ws/ {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 86400;
    }

    # SSE streams (notifications, workflow events)
    location /api/v1/notifications/stream {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_buffering off;
        proxy_cache off;
        proxy_read_timeout 86400;
    }
}

Caddy (automatic TLS)

Caddy handles Let's Encrypt certificate issuance and renewal automatically:

sagewai.example.com {
    handle /api/* {
        reverse_proxy localhost:8000
    }
    handle /ws/* {
        reverse_proxy localhost:8000
    }
    handle {
        reverse_proxy localhost:3008
    }
}

TLS / HTTPS

Three options:

  1. Let's Encrypt via Caddy — automatic certificate issuance and renewal, no manual configuration.
  2. Let's Encrypt via certbot — use with nginx; configure a renewal cron job.
  3. Organizational CA — for internal or enterprise deployments behind a corporate proxy.

For internal deployments with a self-signed certificate:

openssl req -x509 -nodes -days 365 \
  -newkey rsa:2048 \
  -keyout /etc/nginx/certs/key.pem \
  -out /etc/nginx/certs/cert.pem \
  -subj "/CN=sagewai.internal"

Fleet workers

Fleet workers connect to the central gateway and run workflows on your own hardware with your own LLM API keys.

Docker

docker run -d \
  -e FLEET_GATEWAY_URL=https://sagewai.example.com \
  -e ENROLLMENT_KEY=ek-your-enrollment-key-here \
  -e WORKER_POOL=default \
  -e OPENAI_API_KEY=sk-... \
  sagewai/worker:latest

Note: The sagewai/worker image is not yet published to a registry. Build it locally first:

docker build -t sagewai/worker:latest infra/docker/worker/

Docker Compose

Copy the example file from infra/docker/worker/docker-compose.example.yml:

cd infra/docker/worker
cp docker-compose.example.yml docker-compose.yml
# Set FLEET_GATEWAY_URL, ENROLLMENT_KEY, and API keys

# Start one worker
docker compose up -d

# Scale to 3 workers
docker compose up -d --scale worker=3

With local models (Ollama)

Workers can advertise locally available models to the gateway:

docker run -d \
  -e FLEET_GATEWAY_URL=https://sagewai.example.com \
  -e ENROLLMENT_KEY=ek-your-enrollment-key-here \
  -e WORKER_POOL=gpu-workers \
  -e WORKER_MODELS=llama3,mistral \
  -e WORKER_LABELS="gpu=true,env=production" \
  sagewai/worker:latest

Worker enrollment flow

  1. Generate an enrollment key in the admin panel at Fleet > Enrollment Keys > New Key, or run sagewai fleet create-key.
  2. Start the worker with the enrollment key set as ENROLLMENT_KEY.
  3. Approve the worker in the admin panel at Fleet > Workers.
  4. The worker begins receiving tasks immediately after approval.

Monitoring

After running make infra, the following observability tools are available:

ToolURLPurpose
Grafanahttp://localhost:3200Dashboards for agent metrics, costs, latency
Prometheushttp://localhost:9090Metrics scraping and alerting
OTel Collector:4317 (gRPC)Distributed tracing (Jaeger/Zipkin-compatible)

The admin panel also provides built-in health monitoring at Settings > Infrastructure.


Health checks

Verify your deployment with:

# Admin API health summary
curl http://localhost:8000/api/v1/health/summary

# Detailed component status (Postgres, Redis, Milvus, NebulaGraph)
curl http://localhost:8000/api/v1/health/detailed

Both endpoints return JSON with per-component status.


Upgrading

# Pull latest code
git pull origin main

# Apply any new migrations
make db-upgrade

# Restart all services
make start

If you run images directly:

docker compose pull
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
make db-upgrade

Back up your database before applying migrations on a production system.


Running without optional services

Sagewai degrades when optional infrastructure is unavailable:

ServiceIf missingFallback
MilvusVector search disabledIn-memory embeddings (development only)
NebulaGraphGraph memory disabledIn-memory graph store
RedisNo cachingDirect DB queries
LocalStackNo S3 archivalLocal filesystem archives

The minimum viable deployment requires PostgreSQL only. Start with:

make infra-core   # Postgres + Redis
make db-upgrade
make dev-native APP=admin

Troubleshooting

Milvus fails to start

Milvus depends on etcd and MinIO. If it fails with connection errors, check both dependencies:

docker compose logs etcd
docker compose logs minio
docker compose logs milvus

Milvus can take up to 90 seconds to become healthy on first startup. The start_period in the healthcheck configuration controls this window.

NebulaGraph connection refused

NebulaGraph has three components (metad, storaged, graphd) that start in a specific order. Wait until all three show healthy before connecting:

docker compose ps | grep nebula

Database migration errors

If migrations fail after pulling new code:

make db-reset      # Drop and recreate schema
make db-upgrade    # Apply all migrations from scratch
make db-seed       # Re-seed demo data if needed

Out of memory

Milvus uses the most memory of any component. For memory-constrained environments:

  1. Use make infra-core instead of make infra — this skips Milvus, NebulaGraph, and the observability stack.
  2. Set memory limits in a production compose override.
  3. Consider using Zilliz Cloud (managed Milvus) to offload vector storage.

Port conflicts

Default ports used by Sagewai:

PortService
3008Admin frontend
3200Grafana
4317OTel Collector
4566LocalStack (S3)
5432PostgreSQL
6379Redis
8000Admin backend
9090Prometheus
9669NebulaGraph
19530Milvus

If any port is already in use, stop the conflicting service or change the port mapping in docker-compose.yml.