Back to full roadmap
topicadvanced
Multi-Model Orchestration
Use 3-5 models by role instead of one — optimal cost/quality balance.
3 hours1 resources1 prereqs
Classic production pattern: router + planner + executor + judge, each a different model.
- Router (Haiku/4o-mini): classify user question, route to correct pipeline. <100ms, ~$0.0001.
- Planner (Opus thinking / o1): if complex, build the plan. ~5-30s, ~$0.05.
- Executor (Sonnet / 4o): execute each step, call tools. ~3s, ~$0.005.
- Judge (Sonnet): evaluate output, trigger retry on error. ~2s, ~$0.003.
Cost comparison: all-Opus = $0.50/request. Multi-model = $0.08/request. 6× savings + better quality.
Complexity: 4 models = 4 SDK calls, prompt templates, error handlers. Frameworks (LangGraph, Mastra) ease this orchestration.