# Llama 3.1 / 3.2 / 3.3 8B — The RTX 4090 Workhorse: GQA + 128K Context + Turkish Recipe

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-llama-3x-8b-rtx4090-recipe
> Updated: 2026-05-14T14:42:51.244Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part III — Small Open Models (1B–8B)

**TLDR:** Anatomy of Llama 3.1/3.2/3.3 8B-Instruct: 32 layers × 4096 hidden, GQA (8 KV heads), RoPE θ=500K, SwiGLU, RMSNorm, 128K context. QLoRA NF4 + Unsloth on an RTX 4090 with 50K Turkish Alpaca samples, 1 epoch in ~50 min. TR-MMLU baseline 32.4 → fine-tuned 39.8 (+23% relative). Full recipe below.
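To make the GQA figure concrete, here is a back-of-envelope KV-cache calculation. It assumes an fp16 cache and the published Llama 3 8B head counts (32 query heads, 8 KV heads, head dim 4096/32 = 128); the function name is illustrative, not from any library:

```python
# Rough KV-cache size for Llama-3.x 8B at 128K context, showing why
# GQA (8 KV heads instead of 32) matters on a 24 GB RTX 4090.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Bytes for the K and V caches across all layers (fp16 by default).
    Factor of 2 = one K tensor plus one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

HEAD_DIM = 4096 // 32   # 128, from hidden size / query heads
CTX = 128 * 1024        # 128K context

gqa = kv_cache_bytes(32, 8, HEAD_DIM, CTX)    # GQA: 8 KV heads
mha = kv_cache_bytes(32, 32, HEAD_DIM, CTX)   # hypothetical full MHA

print(f"GQA KV cache at 128K: {gqa / 2**30:.0f} GiB")  # 16 GiB
print(f"MHA KV cache at 128K: {mha / 2**30:.0f} GiB")  # 64 GiB
```

The 4× reduction (16 GiB vs. 64 GiB at full 128K context) is what makes long-context inference with this model family tractable on a single consumer GPU; training runs like the recipe below use far shorter sequences, so the cache stays small.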

