# Tokenizer Distillation: Cross-Model Token Mapping and TR Token Efficiency Measurement

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-tokenizer-distillation-token-verimi
> Updated: 2026-05-14T14:42:50.361Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part II — Tokenizer & Data Engineering

**TLDR:** In knowledge distillation, the teacher and student usually use different tokenizers, so token-level labels don't line up between the two models. This recipe builds a cross-tokenizer mapping table for token-level distillation, walks through a GPT-4 → Llama-3 distillation example, and compares Turkish (TR) token efficiency across Llama-3, Qwen 2.5, Gemma 3, Mistral, and Phi-4.
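The core of a cross-tokenizer mapping table is aligning two segmentations of the same text by character spans: each student token is mapped to the teacher tokens whose spans it overlaps, so teacher logits can be aggregated onto student positions. A minimal sketch, assuming both tokenizers can report `(start, end)` character offsets per token (as, e.g., Hugging Face fast tokenizers do via `return_offsets_mapping`); the toy offsets below are hand-written for illustration, not output of any real tokenizer:

```python
def align_tokens(teacher_offsets, student_offsets):
    """Map each student token index to the list of teacher token
    indices whose character spans overlap it. Both offset lists are
    (start, end) pairs over the *same* source string."""
    mapping = {}
    for s_idx, (s_start, s_end) in enumerate(student_offsets):
        mapping[s_idx] = [
            t_idx
            for t_idx, (t_start, t_end) in enumerate(teacher_offsets)
            if t_start < s_end and s_start < t_end  # half-open spans intersect
        ]
    return mapping

# Toy example: one string, two different segmentations.
text = "tokenizer distillation"
teacher = [(0, 9), (9, 10), (10, 22)]           # "tokenizer" | " " | "distillation"
student = [(0, 5), (5, 9), (9, 16), (16, 22)]   # "token" | "izer" | " distil" | "lation"

print(align_tokens(teacher, student))
# → {0: [0], 1: [0], 2: [1, 2], 3: [2]}
```

Student tokens that straddle a teacher boundary (like `" distil"` above) map to several teacher tokens; the distillation loss then has to pool those teacher distributions (e.g., average them) before comparing against the student's prediction at that position.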

