# Tokenizer Extension Lab: Llama-3 → +8K TR Tokens + Embedding Init

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-tr-tokenizer-extension-llama3-lab
> Updated: 2026-05-14T14:42:56.222Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part IX — Turkish-First & Localization Engineering

**TLDR:** The Turkish-specific full lab for Part II, Lesson 2.2. We add the 8K most frequent Turkish tokens to the Llama 3.1 tokenizer, compare byte-decomposition and SVD-based embedding initialization for the new rows, measure the perplexity delta, and run downstream SFT after a 500M-token continual pre-train. Result: tokens per word drops from 3.2 to 2.1.
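The byte-decomposition init mentioned above can be sketched as follows: each new Turkish token is split into pieces that already exist in the base vocabulary, and its embedding row is initialized as the mean of those pieces' rows. This is a minimal NumPy sketch, not the lab's actual code; the function name and the random fallback are assumptions, and the SVD variant is not shown.

```python
import numpy as np

def init_new_embeddings(emb, new_token_pieces, rng=None):
    """Append rows for new tokens to an embedding matrix (hypothetical helper).

    Each new token is decomposed into existing token ids (e.g. its byte-level
    pieces under the base vocab) and its embedding is initialized as the mean
    of those rows, so the new token starts near its old decomposition.
    """
    rng = rng or np.random.default_rng(0)
    new_rows = []
    for piece_ids in new_token_pieces:
        if piece_ids:  # mean of the old-vocab decomposition
            new_rows.append(emb[piece_ids].mean(axis=0))
        else:          # fallback when no decomposition is available (assumption)
            new_rows.append(rng.normal(0.0, 0.02, size=emb.shape[1]))
    return np.vstack([emb, np.array(new_rows)])

# Toy example: 4 base tokens with 3-dim embeddings, 2 new tokens.
old = np.arange(12, dtype=np.float64).reshape(4, 3)
ext = init_new_embeddings(old, [[0, 2], [1, 2, 3]])
print(ext.shape)  # (6, 3)
```

In practice the same idea applies to the model's `embed_tokens` and `lm_head` matrices after `resize_token_embeddings`; starting new rows near the mean of their decomposition avoids the large initial loss spike that random init causes at the start of continual pre-training.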


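The headline metric, tokens per word (fertility), can be computed with a small helper like the one below. This is a sketch, not the lab's measurement code: `tokenize` stands in for the base or extended Llama 3.1 tokenizer, and whitespace splitting is assumed as the word count.

```python
def tokens_per_word(texts, tokenize):
    """Fertility metric: total tokens emitted / total whitespace-separated words.

    `tokenize` is any callable str -> list of tokens; in the lab this would
    wrap the base vs. extended tokenizer to compare 3.2 vs. 2.1 tokens/word.
    """
    n_tokens = sum(len(tokenize(t)) for t in texts)
    n_words = sum(len(t.split()) for t in texts)
    return n_tokens / max(n_words, 1)

# Toy check with a character-pair "tokenizer" standing in for a real one.
dummy = lambda s: [s[i:i + 2] for i in range(0, len(s), 2)]
print(tokens_per_word(["merhaba dünya"], dummy))  # 3.5
```

Running this over a held-out Turkish corpus with both tokenizers gives the before/after comparison quoted in the TLDR.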