# SFT on Reasoning Traces: Llama-8B + R1-Distilled Traces (8K → 32K Context)

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-sft-reasoning-traces
> Updated: 2026-05-14T14:42:58.960Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part XII — Reasoning Model FT (R1-style)

**TLDR:** Once the reasoning-trace dataset is ready, SFT is technically simple, but the details matter: adding `<think>` tokens to the vocabulary, initializing the new embeddings, a 32K context length (R1 traces run 5–15K tokens), loss masking (should the think tokens contribute to the loss?), and epoch count. Llama 3.1 8B on 1,000 R1 traces for 1 epoch takes ~50 minutes on an RTX 4090.
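The vocab-extension step is the one most likely to fail silently, so here is a minimal sketch using Hugging Face `transformers`. The model name and the mean-of-existing-rows initialization are assumptions for illustration, not prescriptions from this cookbook:

```python
# Sketch: register <think>/</think> as special tokens and grow the
# embedding matrix. Assumes HF transformers; model name is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B"  # base model from the title

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Register the R1-style reasoning delimiters so the tokenizer never splits them.
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<think>", "</think>"]}
)

# Grow input (and, for Llama 3.1 8B's untied lm_head, output) embeddings.
model.resize_token_embeddings(len(tokenizer))

# Initialize the new rows to the mean of the existing embeddings rather than
# random noise; this is a common trick to keep early-training loss stable.
with torch.no_grad():
    emb = model.get_input_embeddings().weight
    emb[-num_added:] = emb[:-num_added].mean(dim=0)
    out = model.get_output_embeddings().weight
    out[-num_added:] = out[:-num_added].mean(dim=0)
```

Recent `transformers` versions can do the mean initialization for you via `resize_token_embeddings(..., mean_resizing=True)`; the manual version above makes the choice explicit.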

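On the loss-masking question, one concrete option is to set the labels to `-100` inside the `<think>...</think>` span, so those tokens are ignored by the cross-entropy loss in `transformers`; the opposite choice (training on the trace) is at least as common for R1-style distillation, since the trace is usually what you want the model to learn. A hypothetical helper sketching the masking variant:

```python
# Sketch of one loss-masking choice: exclude the reasoning span from the
# loss. The helper name and span convention are assumptions, not the
# cookbook's API. Labels of -100 are ignored by HF's cross-entropy.
def mask_think_spans(input_ids: list[int], tokenizer) -> list[int]:
    """Return labels with <think>...</think> spans (delimiters included) set to -100."""
    think_id = tokenizer.convert_tokens_to_ids("<think>")
    end_id = tokenizer.convert_tokens_to_ids("</think>")
    labels = list(input_ids)
    inside = False
    for i, tok in enumerate(input_ids):
        if tok == think_id:
            inside = True
        if inside:
            labels[i] = -100  # drop this position from the loss
        if tok == end_id:
            inside = False
    return labels
```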
