Pillar — Training Loop
Bootstrap with the juggernauts. Capture their answers. Train your own model. Deploy locally. Never pay per-token again.
The Training Loop pillar is the company's reason to be, in one sentence. It's the answer to the audience pin's Q3 question — "the CFO is asking why the API bill quadrupled; cut it 50% in 10 weeks without an ML PhD." Capture every Opus answer through the Curator, fine-tune a 3B base model on a free Colab T4, deploy via Ollama, and serve real traffic at zero per-token cost.
End-to-end the loop costs under $5 and a weekend.
What the pillar does
- Curator — captures every agent answer to JSONL at `~/.sagewai/training/`. Auto-instrumented; you don't write capture code.
- `TrainingDataset`, `Promoter` — promote captured samples to a training-grade dataset.
- `FineTuneJob` — kicks off a fine-tune when the dataset crosses a threshold (the three steps are sketched after this list).
- Unsloth integration — real LoRA fine-tunes of small (3-7B) base models on commodity GPUs.
- Inference spectrum — five GPU-provisioning tiers, plus Ollama and `mlx_lm.server` for local deploy. See Inference deployment.
- Cost-down measurement — every fine-tune example reports $/call vs the cloud baseline.
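A minimal, self-contained sketch of the capture → promote → threshold flow, using only the standard library. Only the `~/.sagewai/training/` path, the JSONL capture, and the threshold trigger come from this page; the record fields, the promotion rule, and the 1,000-sample threshold are illustrative assumptions standing in for the real `Curator` / `TrainingDataset` / `Promoter` / `FineTuneJob` surface.

```python
import json
from pathlib import Path

CAPTURE_DIR = Path.home() / ".sagewai" / "training"  # the Curator's JSONL drop
THRESHOLD = 1_000  # assumed sample count before a fine-tune kicks off

def load_captured():
    """Yield every JSONL record the Curator has captured."""
    for path in sorted(CAPTURE_DIR.glob("*.jsonl")):
        with path.open() as fh:
            for line in fh:
                yield json.loads(line)  # e.g. {"prompt": ..., "response": ...}

# Promote: keep only samples worth training on (this rule is an assumption).
promoted = [r for r in load_captured() if len(r.get("response", "")) > 50]

# Fine-tune trigger: fire once the training-grade dataset crosses the threshold.
if len(promoted) >= THRESHOLD:
    print(f"{len(promoted)} samples captured; time to kick off a FineTuneJob")
```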
What proves it works
Primary lighthouse
Train your own model — the loop end-to-end. Examples 25, 36, 38, 38a, 44-48 compose into the full capture → fine-tune → deploy arc. Real numbers, real LoRA, real cost-down.
Sibling lighthouse
Inference deployment — the deploy half in detail. Five GPU tiers, two local-deploy paths, one SDK surface.
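One way to picture "one SDK surface": both local-deploy paths expose an OpenAI-compatible endpoint, so the client code is identical and only the base URL changes. In this sketch the model tag `my-finetune` is hypothetical; the ports are the upstream defaults (11434 for Ollama, 8080 for `mlx_lm.server`).

```python
from openai import OpenAI

# Ollama's OpenAI-compatible endpoint (default port 11434).
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
# For mlx_lm.server, swap only the base_url (default port 8080):
# client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

resp = client.chat.completions.create(
    model="my-finetune",  # hypothetical tag for your deployed fine-tune
    messages=[{"role": "user", "content": "Why did the API bill quadruple?"}],
)
print(resp.choices[0].message.content)
```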
Pattern + foundation
- Example 25 — `training_data_pipeline` — Curator capture surface.
- Example 36 — `autopilot_training_loop` — the loop closes.
- Example 38 — `unsloth_finetune` — real Unsloth fine-tune (sketched after this list).
- Example 38a — `mlx_lm_server_deploy` — Apple Silicon deploy.
- Example 18 — `local_llm_routing` — Ollama / LM Studio swap.
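For orientation, a condensed sketch of what an Unsloth LoRA fine-tune like example 38 typically involves. The model name and hyperparameters here are illustrative, not the example's actual settings, and the exact `SFTTrainer` keyword set varies across trl versions.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load a small base model in 4-bit; this fits a free Colab T4.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",  # illustrative 3B choice
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; only these low-rank weights are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Train on the promoted JSONL dataset (assumes a "text" field per record).
dataset = load_dataset("json", data_files="train.jsonl", split="train")
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
model.save_pretrained("lora_adapter")  # adapter weights, ready to deploy
```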
Where to go to ship it
- Self-Learning Agents — the SDK concept page for the loop.
- Training & Fine-Tuning — operator-level guide to running a fine-tune.
- Inference — overview — the five-tier comparison.
- Inference — start with juggernauts — why Opus and GPT-5 are the right starting point in Q1.
- Inference — free CUDA via Colab — the democratisation tutorial.
- Inference — rent when you grow — RunPod, Vast.ai, Modal — when to pick which.
- Inference — deploy locally — Ollama and LiteLLM, "never pay per-token again" explained.
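To make "never pay per-token again" concrete, a back-of-the-envelope version of the $/call cost-down metric. Every number below is an illustrative assumption, not a quote.

```python
# Cloud baseline vs local deploy, per the cost-down metric above.
CLOUD_COST_PER_MTOK = 15.00  # assumed $/1M output tokens on a frontier API
TOKENS_PER_CALL = 800        # assumed average answer length
CALLS_PER_MONTH = 100_000

cloud_per_call = CLOUD_COST_PER_MTOK * TOKENS_PER_CALL / 1_000_000
cloud_monthly = cloud_per_call * CALLS_PER_MONTH
local_monthly = 0.0          # Ollama on hardware you already own

print(f"cloud: ${cloud_per_call:.4f}/call, ${cloud_monthly:,.0f}/month")
print(f"local: ${local_monthly:,.0f}/month marginal; the per-token bill goes away")
```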
Related
- Autopilot pillar — the autopilot closes the loop by triggering `FineTuneJob` when the dataset crosses the threshold.
- Observatory pillar — the cost-down measurement surface.
- Pillars overview — the other four pillars.