Hardware Requirements

Sagewai runs anywhere from a laptop to a multi-node cluster. This guide covers system requirements for every deployment scenario.

Deployment Profiles

| Profile | RAM | CPU | Disk | GPU | Use Case |
|---|---|---|---|---|---|
| SDK Only | 512 MB | Any 2-core | 500 MB | None | pip install + cloud APIs |
| Lightweight Dev | 2 GB | 2-core | 2 GB | None | Postgres + Redis only |
| Full Dev Stack | 8 GB+ | 4-core | 15 GB | None | All infrastructure services |
| Production | 16 GB+ | 8-core | 50 GB+ | Optional | Self-hosted with observability |
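
Before picking a profile, it can help to check the current machine against these thresholds. The sketch below is not part of the Sagewai SDK; the `PROFILES` dict just encodes the table above, and the RAM probe uses POSIX `sysconf`, so it returns 0 on platforms where that is unavailable.

```python
import os
import shutil

# Minimum thresholds per deployment profile, taken from the table above
# (RAM and disk in MB, CPU in cores).
PROFILES = {
    "sdk-only":        {"ram_mb": 512,   "cores": 2, "disk_mb": 500},
    "lightweight-dev": {"ram_mb": 2048,  "cores": 2, "disk_mb": 2048},
    "full-dev":        {"ram_mb": 8192,  "cores": 4, "disk_mb": 15360},
    "production":      {"ram_mb": 16384, "cores": 8, "disk_mb": 51200},
}

def total_ram_mb() -> int:
    """Total physical RAM in MB (POSIX only; returns 0 when unknown)."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") // 2**20
    except (ValueError, OSError, AttributeError):
        return 0

def unmet_requirements(profile: str, path: str = ".") -> list:
    """List the requirements the current machine fails for a profile."""
    req = PROFILES[profile]
    problems = []
    ram = total_ram_mb()
    if ram and ram < req["ram_mb"]:
        problems.append(f"RAM: have {ram} MB, need {req['ram_mb']} MB")
    cores = os.cpu_count() or 0
    if cores < req["cores"]:
        problems.append(f"CPU: have {cores} cores, need {req['cores']}")
    free_mb = shutil.disk_usage(path).free // 2**20
    if free_mb < req["disk_mb"]:
        problems.append(f"Disk: have {free_mb} MB free, need {req['disk_mb']} MB")
    return problems
```

An empty list from `unmet_requirements("full-dev")` means the machine clears the Full Dev Stack row.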

Per-Service Memory Breakdown

| Service | Memory | Purpose |
|---|---|---|
| PostgreSQL 15 | ~200 MB | State, workflows, fleet, audit |
| Redis 7 | ~50 MB | Cache, sessions |
| Milvus 2.3 (etcd + MinIO + standalone) | ~2 GB | Vector embeddings for RAG |
| NebulaGraph 3.6 (metad + storaged + graphd) | ~1.5 GB | Knowledge graph, relations |
| Observability (Grafana + Prometheus + OTel + Loki + Tempo) | ~1.5 GB | Dashboards, metrics, tracing |
| LocalStack | ~300 MB | S3-compatible archive (dev) |
| Total (full stack) | ~5.5 GB | |
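
Since services can be enabled independently, the memory budget for any subset is just the sum of the rows above. A small sketch (service keys are illustrative shorthand, with the GB figures rounded to binary MB):

```python
# Per-service memory figures (MB) from the table above.
SERVICE_MEMORY_MB = {
    "postgresql": 200,
    "redis": 50,
    "milvus": 2048,
    "nebulagraph": 1536,
    "observability": 1536,
    "localstack": 300,
}

def stack_memory_mb(services) -> int:
    """Estimated resident memory for a chosen subset of services."""
    return sum(SERVICE_MEMORY_MB[s] for s in services)

lightweight = stack_memory_mb(["postgresql", "redis"])  # 250 MB
full_stack = stack_memory_mb(SERVICE_MEMORY_MB)         # 5670 MB, i.e. ~5.5 GB
```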

GPU Requirements for Local Inference

| Task | VRAM | Recommended GPU | CPU Alternative |
|---|---|---|---|
| Ollama 7B (q4_K_M) | 4 GB | RTX 3060 | Yes, ~10 tok/s |
| Ollama 13B (q4_K_M) | 8 GB | RTX 3070 | Yes, ~5 tok/s |
| Ollama 70B (q4_K_M) | 40 GB | A100 / 2x RTX 3090 | Impractical |
| vLLM serving (7B-70B) | 8-80 GB | NVIDIA A10G+ | No |
| Unsloth fine-tune 7B | 6 GB | RTX 3060+ | Very slow |
| Unsloth fine-tune 13B | 12 GB | RTX 3090+ | Impractical |
| Sentence-transformers | 512 MB | Any | Yes (default) |
| GLiNER NER | 512 MB | Any | Yes (default) |
| faster-whisper (base) | 1 GB | Any | Yes |

A GPU is needed only for local inference. If you use cloud APIs (OpenAI, Anthropic, Google), no GPU is required at all.
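
The VRAM column follows roughly from parameter count times quantized bits per weight. The estimator below is a back-of-the-envelope sketch: the ~4.5 effective bits per weight for q4_K_M and the flat 0.5 GB allowance for KV cache and activations are assumptions, not measured values.

```python
def quantized_vram_gb(params_billion: float,
                      bits_per_weight: float = 4.5,
                      overhead_gb: float = 0.5) -> float:
    """Weight memory for a quantized model plus a flat allowance for
    KV cache and activations. Both defaults are rough assumptions."""
    weights_gb = params_billion * bits_per_weight / 8
    return round(weights_gb + overhead_gb, 1)

quantized_vram_gb(7)    # 4.4 — in line with the ~4 GB row for 7B q4_K_M
quantized_vram_gb(13)   # 7.8 — in line with the ~8 GB row
quantized_vram_gb(70)   # 39.9 — in line with the 40 GB row
```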

Disk Usage Reference

| Component | Size |
|---|---|
| SDK + core dependencies | ~500 MB |
| Intelligence extras (torch, transformers) | ~5 GB |
| Ollama model weights | 4-40 GB per model |
| Milvus data (per 1M vectors, 1536-dim) | ~6 GB |
| Container images (full stack) | ~8 GB |
| NebulaGraph data | ~500 MB per 1M edges |
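
The Milvus figure is just raw float32 embedding storage: 1M vectors × 1536 dimensions × 4 bytes ≈ 6 GB, before index and metadata overhead. To size other corpora:

```python
def vector_storage_gb(n_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw float32 embedding storage in GB, before index/metadata overhead."""
    return n_vectors * dim * bytes_per_value / 1e9

vector_storage_gb(1_000_000, 1536)  # 6.144 — the ~6 GB per 1M vectors above
```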

Cloud Instance Recommendations

| Profile | AWS | GCP | Azure |
|---|---|---|---|
| SDK Only | t3.small | e2-small | B2s |
| Lightweight Dev | t3.medium | e2-medium | B2ms |
| Full Dev Stack | m5.xlarge | e2-standard-4 | D4s_v5 |
| Production (no GPU) | m5.2xlarge | e2-standard-8 | D8s_v5 |
| Production (GPU) | g5.xlarge (A10G) | g2-standard-4 (L4) | NC6s_v3 (V100) |

Apple Silicon Notes

Sagewai runs natively on Apple Silicon (M1/M2/M3/M4) Macs:

  • Ollama uses Metal GPU acceleration automatically — no configuration needed
  • Sentence-transformers and GLiNER work on CPU (MPS support varies)
  • Docker runs via Docker Desktop or OrbStack with Rosetta 2 emulation for x86 images
  • Recommended: MacBook Pro M2+ with 16 GB unified memory for full dev stack
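
When a setup step depends on native-versus-emulated execution (e.g. deciding whether to pull arm64 container images), a quick check from Python works: an interpreter running under Rosetta 2 reports an x86_64 architecture, so this also distinguishes native from emulated installs.

```python
import platform
import sys

def is_apple_silicon() -> bool:
    """True when Python is running natively on an Apple Silicon Mac.
    Under Rosetta 2, platform.machine() reports "x86_64" instead."""
    return sys.platform == "darwin" and platform.machine() == "arm64"
```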

Minimum Requirements Summary

At minimum, you need Python 3.10+ and 512 MB of free RAM. Everything else is optional and scales with your needs. Sagewai degrades gracefully — without Milvus, it uses in-memory vectors; without NebulaGraph, in-memory graphs; without Redis, direct database queries.
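
The degradation path for vectors can be sketched as a factory that falls back to a brute-force in-memory store when Milvus is unreachable. The class and function names below are illustrative, not the actual Sagewai API:

```python
import math

class InMemoryVectorStore:
    """Fallback used when Milvus is unavailable: brute-force cosine search."""
    def __init__(self):
        self._vectors = {}

    def add(self, key, vector):
        self._vectors[key] = vector

    def search(self, query, top_k=5):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self._vectors,
                        key=lambda k: cosine(query, self._vectors[k]),
                        reverse=True)
        return ranked[:top_k]

def make_vector_store(milvus_reachable: bool):
    """Return a Milvus-backed store when available, else the fallback."""
    if milvus_reachable:
        raise NotImplementedError("wire up a Milvus client here")
    return InMemoryVectorStore()
```

The same shape applies to the other fallbacks: in-memory graphs in place of NebulaGraph, and direct database queries in place of Redis.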