Pillar — Training Loop
Bootstrap with the juggernauts. Capture their answers. Train your own model. Deploy locally. Never pay per-token again.
The Training Loop pillar is the company's reason to be, in one sentence. It's the answer to the audience pin's Q3 question — "the CFO is asking why the API bill quadrupled; cut it 50% in 10 weeks without an ML PhD." Capture every Opus answer through the Curator, fine-tune a 3B base model on a free Colab T4, deploy via Ollama, and serve real traffic at zero per-token cost.
End-to-end the loop costs under $5 and a weekend.
What the pillar does
- Curator — captures every agent answer to JSONL at `~/.sagewai/training/`. Auto-instrumented; you don't write capture code.
- `TrainingDataset`, `Promoter` — promote captured samples to a training-grade dataset.
- `FineTuneJob` — kicks off a fine-tune when the dataset crosses a threshold (the three steps are sketched after this list).
- Unsloth integration — real LoRA fine-tunes of small (3-7B) base models on commodity GPUs.
- Inference spectrum — five GPU-provisioning tiers, plus Ollama and `mlx_lm.server` for local deploy. See Inference deployment.
- Cost-down measurement — every fine-tune example reports $/call vs the cloud baseline.
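A minimal, self-contained sketch of the capture → promote → threshold flow, using only the standard library. Only the `~/.sagewai/training/` path, the JSONL capture, and the threshold trigger come from this page; the record fields, the promotion rule, and the 1,000-sample threshold are illustrative assumptions standing in for the real `Curator` / `TrainingDataset` / `Promoter` / `FineTuneJob` surface.

```python
import json
from pathlib import Path

CAPTURE_DIR = Path.home() / ".sagewai" / "training"  # the Curator's JSONL drop
THRESHOLD = 1_000  # assumed sample count before a fine-tune kicks off

def load_captured():
    """Yield every JSONL record the Curator has captured."""
    for path in sorted(CAPTURE_DIR.glob("*.jsonl")):
        with path.open() as fh:
            for line in fh:
                yield json.loads(line)  # e.g. {"prompt": ..., "response": ...}

# Promote: keep only samples worth training on (this rule is an assumption).
promoted = [r for r in load_captured() if len(r.get("response", "")) > 50]

# Fine-tune trigger: fire once the training-grade dataset crosses the threshold.
if len(promoted) >= THRESHOLD:
    print(f"{len(promoted)} samples captured; time to kick off a FineTuneJob")
```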
What proves it works
Primary lighthouse
Train your own model — the loop end-to-end. Examples 25, 36, 38, 38a, 44-48 compose into the full capture → fine-tune → deploy arc. Real numbers, real LoRA, real cost-down.
Sibling lighthouse
Inference deployment — the deploy half in detail. Five GPU tiers, two local-deploy paths, one SDK surface.
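One way to picture "one SDK surface": both local-deploy paths expose an OpenAI-compatible endpoint, so the client code is identical and only the base URL changes. In this sketch the model tag `my-finetune` is hypothetical; the ports are the upstream defaults (11434 for Ollama, 8080 for `mlx_lm.server`).

```python
from openai import OpenAI

# Ollama's OpenAI-compatible endpoint (default port 11434).
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
# For mlx_lm.server, swap only the base_url (default port 8080):
# client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

resp = client.chat.completions.create(
    model="my-finetune",  # hypothetical tag for your deployed fine-tune
    messages=[{"role": "user", "content": "Why did the API bill quadruple?"}],
)
print(resp.choices[0].message.content)
```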
Pattern + foundation
- Example 25 — `training_data_pipeline` — Curator capture surface.
- Example 36 — `autopilot_training_loop` — the loop closes.
- Example 38 — `unsloth_finetune` — real Unsloth fine-tune (sketched after this list).
- Example 38a — `mlx_lm_server_deploy` — Apple Silicon deploy.
- Example 18 — `local_llm_routing` — Ollama / LM Studio swap.
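For orientation, a condensed sketch of what an Unsloth LoRA fine-tune like example 38 typically involves. The model name and hyperparameters here are illustrative, not the example's actual settings, and the exact `SFTTrainer` keyword set varies across trl versions.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load a small base model in 4-bit; this fits a free Colab T4.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",  # illustrative 3B choice
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; only these low-rank weights are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Train on the promoted JSONL dataset (assumes a "text" field per record).
dataset = load_dataset("json", data_files="train.jsonl", split="train")
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
model.save_pretrained("lora_adapter")  # adapter weights, ready to deploy
```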
Where to go to ship it
- Self-Learning Agents — the SDK concept page for the loop.
- Training & Fine-Tuning — operator-level guide to running a fine-tune.
- Inference — overview — the five-tier comparison.
- Inference — start with juggernauts — why Opus and GPT-5 are the right starting point in Q1.
- Inference — free CUDA via Colab — the democratisation tutorial.
- Inference — rent when you grow — RunPod, Vast.ai, Modal — when to pick which.
- Inference — deploy locally — Ollama and LiteLLM, "never pay per-token again" explained.
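To make "never pay per-token again" concrete, a back-of-the-envelope version of the $/call cost-down metric. Every number below is an illustrative assumption, not a quote.

```python
# Cloud baseline vs local deploy, per the cost-down metric above.
CLOUD_COST_PER_MTOK = 15.00  # assumed $/1M output tokens on a frontier API
TOKENS_PER_CALL = 800        # assumed average answer length
CALLS_PER_MONTH = 100_000

cloud_per_call = CLOUD_COST_PER_MTOK * TOKENS_PER_CALL / 1_000_000
cloud_monthly = cloud_per_call * CALLS_PER_MONTH
local_monthly = 0.0          # Ollama on hardware you already own

print(f"cloud: ${cloud_per_call:.4f}/call, ${cloud_monthly:,.0f}/month")
print(f"local: ${local_monthly:,.0f}/month marginal; the per-token bill goes away")
```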
Related
- Autopilot pillar — the autopilot closes the loop by triggering `FineTuneJob` when the dataset crosses the threshold.
- Observatory pillar — the cost-down measurement surface.
- Pillars overview — the other four pillars.