Back to full roadmap
topicfoundation
Tokenization
Token = the atomic unit the model sees. Token count = cost + context consumption.
2 hours2 hours3 resources1 prereqs
Models don't see sentences, they see tokens. For GPT/Claude, 1 token ≈ 4 English characters but only ≈ 1.5-2 Turkish characters. Turkish costs more because BPE was trained heavily on English.
Practical:
- Same content in EN may use 30-50% fewer tokens
- Stripping JSON whitespace saves money
- Long-context calls are expensive; send only what's needed
What you'll gain
You'll estimate token counts at a glance and budget cost intentionally.