Fine-Tuning Cookbook (Model-by-Model)
Table of Contents
Part 0 — Engineering Foundations
- 1
Welcome to the Fine-Tuning Cookbook: System, Stage Taxonomy, and the Reproducibility Contract
User manual for this cookbook: 5-component lesson anatomy (Theory/Math/Lab/Debug/Bench), Stage taxonomy (Spike → Reference → Production → Research), reproducibility contract (bit-exact runs), why the RTX 4090 baseline, GPU budgeting math.
- 2
Reproducibility Stack: Seeds, cuDNN Flags, and Deterministic CUDA — End the 'Works on My Machine' Problem
ML's most expensive time sink: irreproducible results. This lesson: seed management, cuDNN/cuBLAS deterministic flags, ATen non-deterministic op detection, dataloader worker seeding, cost of deterministic scatter/gather — all with practical code and real logs.
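A minimal sketch of the setup this lesson builds — the stock PyTorch determinism knobs; the lesson layers dataloader worker seeding and the performance-cost discussion on top:

```python
import os, random
import numpy as np
import torch

def make_deterministic(seed: int = 42) -> None:
    # Seed every RNG that touches training.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Fail loudly on non-deterministic ATen ops instead of silently diverging.
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Required for deterministic cuBLAS matmuls on CUDA >= 10.2.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```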
- 3
Environment Pinning: uv + pyproject.toml, CUDA Version Matrix, and Container Recipes
Second half of reproducibility: pin lib versions, understand the CUDA matrix, write Docker/Apptainer recipes. Where uv beats pip+poetry by 10-100x, the CUDA 12.4 / PyTorch 2.5 stack for RTX 4090, compatibility matrix for FT frameworks (TRL, Unsloth, Axolotl).
- 4
Container & Slurm Recipes: Bridging Single 4090 to Cloud Multi-Node
How to take a recipe you prepared on a single 4090 to an 8×H100 cluster: Slurm sbatch template, multi-node NCCL setup, EFA/InfiniBand sanity check, real hourly prices for Lambda/RunPod/CoreWeave/Vast, preemption-tolerant training, checkpoint manifest, fault-tolerance principles.
- 5
Experiment Tracking Architecture: W&B + Hydra + DVC — The Engineering of Sweeps
Disciplining ML experiments: config-driven runs with Hydra, sweep + system metrics + offline mode with W&B, dataset/checkpoint versioning with DVC, alias/lineage tracking. The cookbook's 'reportable Lab' standard.
Part I — Hardware & Memory Engineering
- 1
The Anatomy of GPU Memory Budgeting: W + G + O + A + B — Managing the 24GB on RTX 4090 at the Atom Level
The most common phrase in fine-tuning: 'OOM'. This lesson ends random OOMs forever. Break down the Weights/Grads/Optimizer/Activations/Buffers budget; understand mathematically why AdamW needs 8 bytes/param, Lion 4, and NF4 fits at 0.5. Fit Llama 3.1 8B into 24GB with 4 different methods.
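The W + G + O arithmetic in miniature — a rough estimator using the lesson's headline bytes-per-param numbers (activations and buffers, the A and B terms, are budgeted separately):

```python
def static_memory_gb(n_params: float, regime: str) -> float:
    """Weights + grads + optimizer states, in GB (1e9 bytes)."""
    bytes_per_param = {
        "full_ft_adamw": 2 + 2 + 8,  # bf16 W, bf16 G, two fp32 AdamW moments
        "full_ft_lion":  2 + 2 + 4,  # Lion keeps a single moment
        "qlora_nf4":     0.5,        # frozen NF4 base; small adapter cost extra
    }[regime]
    return n_params * bytes_per_param / 1e9

print(static_memory_gb(8.03e9, "full_ft_adamw"))  # ~96 GB: hopeless on 24 GB
print(static_memory_gb(8.03e9, "qlora_nf4"))      # ~4 GB frozen base: fits
```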
- 2
Anatomy of Activation Memory: Why O(L·s·h) and the Real Savings of FlashAttention
Activation memory: forward pass's most misleading memory consumer. Layer-by-layer breakdown (attn intermediates, FFN, norm, residual), FlashAttention's saved-memory math (O(s²)→O(s)), the 'sqrt(L) savings' myth of grad-checkpoint, packing + variable-length attention.
- 3
Gradient Checkpointing Trade-off Lab: Compressing Memory by Crediting Compute
Decision tree for gradient checkpointing: per-layer, segment-based, custom selective? Re-entrant vs non-re-entrant difference, torch.utils.checkpoint vs HF Trainer kwargs, selective checkpointing. 5-strategy bench on RTX 4090 + Llama 3.1 8B.
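A sketch of per-layer checkpointing with torch.utils.checkpoint — a toy block stands in for a transformer layer; the lab compares this against the HF Trainer kwarg and selective variants:

```python
import torch
from torch.utils.checkpoint import checkpoint

class Block(torch.nn.Module):
    def __init__(self, dim: int = 4096):
        super().__init__()
        self.ff = torch.nn.Sequential(
            torch.nn.Linear(dim, 4 * dim), torch.nn.GELU(), torch.nn.Linear(4 * dim, dim))

    def forward(self, x):
        return x + self.ff(x)

blocks = torch.nn.ModuleList(Block() for _ in range(4))

def forward_with_ckpt(x):
    for blk in blocks:
        # Activations inside blk are dropped and recomputed during backward;
        # use_reentrant=False selects the non-reentrant path the lesson compares.
        x = checkpoint(blk, x, use_reentrant=False)
    return x
```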
- 4
Mixed Precision Architecture: bf16 vs fp16 vs fp8 — Why Pure bf16 on RTX 4090?
fp16's loss scaling complexity, bf16's 'master fp32' pattern, fp8 (Ada supports but H100 is native), TF32 matmul precision flag, autocast nuances — cookbook's clear choice of pure bf16 for RTX 4090. NaN cost and training stability math.
- 5
PCIe vs NVLink vs InfiniBand: The Invisible Impact of Bandwidth on Training
Bandwidth is invisible on a single 4090 but at scale-out it alone can slow training. PCIe 4.0/5.0 lane math, NVLink (and why 4090 doesn't have it), NVSwitch topology, InfiniBand 400G, threshold where NCCL all-reduce becomes network-bound, p2p_access detection, GPU-direct.
- 6
Storage I/O Engineering: How Datasets Quietly Throttle Training (and How to Prevent It)
Dataset bottleneck: GPU is 30% idle waiting for disk. NVMe Gen3/Gen4/Gen5 throughput, dataset format choice (parquet vs arrow vs webdataset), HuggingFace datasets caching, num_workers tuning, prefetch_factor, persistent_workers, pinned memory, FSx vs S3 vs local — recipe to run RTX 4090 + 50K Turkish dataset with 0 idle.
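The knobs this lesson tunes, in one DataLoader call (`dataset` is an assumed, already-tokenized map-style dataset; the right num_workers comes out of profiling, not folklore):

```python
from torch.utils.data import DataLoader

loader = DataLoader(
    dataset,                  # assumed: tokenized map-style dataset
    batch_size=8,
    num_workers=8,            # start near physical cores / 2, then profile
    prefetch_factor=4,        # batches each worker keeps pre-loaded
    persistent_workers=True,  # don't re-fork workers every epoch
    pin_memory=True,          # page-locked host memory -> faster H2D copies
)
```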
- 7
Profiling Stack: torch.profiler + Nsight Systems + Nsight Compute + MFU Calculation
Optimization without profiling is hot air. Python-level timing with torch.profiler, kernel-level timeline with Nsight Systems (nsys), kernel-internal metrics with Nsight Compute (ncu), MFU (Model FLOPs Utilization) calculation. Cookbook certification: each Lab MFU > 35%.
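A back-of-envelope MFU calculator under the standard 6·N·D training-FLOPs approximation (the 165 TFLOP/s figure is an assumed RTX 4090 dense bf16 tensor-core peak — substitute your card's datasheet value):

```python
def mfu(n_params: float, tokens_per_sec: float, peak_tflops: float = 165.0) -> float:
    """Model FLOPs Utilization: achieved training FLOPs / hardware peak."""
    achieved_tflops = 6 * n_params * tokens_per_sec / 1e12  # ~6*N FLOPs per token
    return achieved_tflops / peak_tflops

print(f"{mfu(8.03e9, 1500):.1%}")  # 8B model at 1500 tok/s -> ~43.8% MFU
```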
- 8
Cost Engineering: Local 4090 vs Cloud H100? — Breakeven, Spot, and TCO Math
The silent question every FT engineer asks: 'Should I do this on local 4090 or send it to cloud?' Cookbook's decisive math: RTX 4090 amortization, electricity, cloud hourly tables, spot risk calculation, breakeven duration, hybrid strategy (4090 dev + cloud production).
Part II — Tokenizer & Data Engineering
- 1
BPE / SentencePiece / Unigram: The Math of Tokenizer Algorithms and Training a TR-Aware Tokenizer from Scratch
BPE's merge table, SentencePiece's language-agnostic byte/char model, Unigram's EM training; why each results in different token efficiency. Training a 50K-vocab BPE on 1.5GB Turkish corpus on RTX 4090 (~12 min). Mathematical proof of why TR-aware tokenizer beats Llama-3's default by 1.6x.
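A minimal byte-level BPE training sketch with the HF tokenizers library (file path and special token are illustrative; the lab pins the real corpus and normalizers):

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
trainer = trainers.BpeTrainer(vocab_size=50_000, special_tokens=["<|endoftext|>"])
tokenizer.train(files=["tr_corpus.txt"], trainer=trainer)  # assumed corpus path
tokenizer.save("tr_bpe_50k.json")
```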
- 2
Vocabulary Extension: Add 8K TR Tokens to Llama-3 Tokenizer (Embedding Init Strategies)
Llama-3 default tokenizer is 128K — multilingual but inefficient for TR. The 'extension' approach: add 8K TR-specific tokens to Llama-3's vocab, expand embedding from 128K→136K, intelligently init new rows (mean-init, SVD-init, byte-decomp). Practical lab + perplexity delta on RTX 4090.
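Mean-init, the simplest of the three strategies, in a few lines (a sketch against the standard transformers API; SVD-init and byte-decomp replace the final assignment):

```python
import torch

def extend_embeddings_mean_init(model, tokenizer, new_tokens):
    n_added = tokenizer.add_tokens(new_tokens)
    old_emb = model.get_input_embeddings().weight.data.clone()
    model.resize_token_embeddings(len(tokenizer))
    with torch.no_grad():
        # New rows start at the mean of the old embedding matrix.
        model.get_input_embeddings().weight.data[-n_added:] = old_emb.mean(dim=0)
    # Note: an untied lm_head needs the same treatment on its new output rows.
    return n_added
```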
- 3
Tokenizer Distillation: Cross-Model Token Mapping and TR Token Efficiency Measurement
When distilling, teacher and student tokenizers differ → label mismatch. Building cross-tokenizer mapping table for token-level distillation, GPT-4 → Llama-3 distill example, comparison of TR token efficiency (Llama-3 vs Qwen 2.5 vs Gemma 3 vs Mistral vs Phi-4).
- 4
Chat Template Anatomy: Jinja, Special Tokens, and Token-by-Token Breakdown
Chat template = the format LLM understands as 'conversation'. Token-by-token anatomy of Llama-3, Qwen 2.5, Gemma 3, Mistral, Phi-4 chat templates. What apply_chat_template does under the hood, token IDs of system/user/assistant roles, tool-calling extensions, multimodal turn formats.
- 5
Loss Masking: The Real Implementation of 'Loss Only on Response'
Loss masking is the cornerstone of SFT. How IGNORE_INDEX=-100 interacts with PyTorch CrossEntropyLoss, how instruction tokens are masked while response is kept, source-code reading of Unsloth's train_on_responses_only, turn-by-turn masking in multi-turn conversations, edge cases.
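The core mechanic in isolation — prompt positions set to -100 so CrossEntropyLoss drops them (a toy example; real pipelines do this per turn using the chat template's offsets):

```python
IGNORE_INDEX = -100  # the default ignore_index of torch.nn.CrossEntropyLoss

def mask_prompt_tokens(input_ids, prompt_len):
    labels = list(input_ids)
    labels[:prompt_len] = [IGNORE_INDEX] * prompt_len  # no loss on instruction
    return labels

# 5 instruction tokens, 3 response tokens:
print(mask_prompt_tokens([101, 7592, 2088, 2003, 102, 318, 419, 2], 5))
# -> [-100, -100, -100, -100, -100, 318, 419, 2]
```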
- 6
Dataset Quality Pipeline: MinHash Dedupe + Perplexity Filter + Toxicity + Educational-Value
Garbage in, garbage out. SFT dataset quality pipeline: MinHash LSH for near-duplicates (~30-40% are duplicates), KenLM 5-gram perplexity filter, HateBERT-TR toxicity, FineWeb-style educational-value scorer. Clean 1M-row TR dataset in 25 min on RTX 4090.
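The MinHash LSH stage in miniature with the datasketch library (`corpus` is an assumed iterable of strings; threshold and num_perm are typical values, not the pipeline's pinned ones):

```python
from datasketch import MinHash, MinHashLSH

def shingles(text: str, k: int = 5):
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}

lsh = MinHashLSH(threshold=0.8, num_perm=128)  # Jaccard similarity cutoff
kept = []
for idx, doc in enumerate(corpus):  # corpus: assumed iterable of documents
    mh = MinHash(num_perm=128)
    for s in shingles(doc):
        mh.update(s.encode("utf-8"))
    if not lsh.query(mh):           # no near-duplicate kept so far
        lsh.insert(str(idx), mh)
        kept.append(doc)
```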
- 7
Synthetic Data: Self-Instruct, Evol-Instruct, OSS-Instruct, MAGPIE (TR Adaptation)
Instruction data is scarce for TR. Solution: synthetic generation. TR adaptation of Self-Instruct (Stanford 2022), Evol-Instruct (WizardLM), OSS-Instruct (Magicoder), MAGPIE (2024). Teacher model selection ethics (GPT-4 ToS), prompt engineering, automated quality control.
- 8
Data Mixing Math: Sampling Temperature, DoReMi, Domain Reweighting
How to mix multiple datasets? Naïve concatenation = the large dataset dominates. Sampling temperature, proportional mixing, DoReMi (Xie et al. 2023) algorithm for dynamic reweighting. Turkish SFT mix example: 40% TR-Alpaca + 25% OASST + 20% ShareGPT-TR + 15% custom — why these percentages?
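Temperature-scaled sampling in two lines — T=1 reproduces proportional mixing, T→∞ approaches uniform (sizes below roughly mirror the TR mix above and are illustrative):

```python
import numpy as np

def mixing_weights(sizes, temperature: float = 2.0):
    # p_i = n_i^(1/T) / sum_j n_j^(1/T)
    p = np.asarray(sizes, dtype=float) ** (1.0 / temperature)
    return p / p.sum()

print(mixing_weights([400_000, 250_000, 200_000, 150_000]))
# T=2 flattens 40/25/20/15 toward ~32/25/23/20
```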
- 9
Sequence Packing & Variable-Length Attention: The Trick That Boosts Throughput by 40%
Padding tokens are wasted compute. Packing: concat multiple short examples into one sequence. Variable-length attention (flash_attn_varlen_func) with block-diagonal mask. TRL SFTTrainer packing=True internals, cu_seqlens tensor anatomy, throughput bench.
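The cu_seqlens tensor in isolation — the cumulative boundaries flash_attn_varlen_func uses to keep packed examples from attending to each other (a sketch; real collators build it straight from the batch):

```python
import torch

def build_cu_seqlens(seq_lens):
    cu = torch.zeros(len(seq_lens) + 1, dtype=torch.int32)
    cu[1:] = torch.cumsum(torch.tensor(seq_lens), dim=0)  # example boundaries
    return cu

# Three examples of length 5, 3, 8 packed into one 16-token sequence:
print(build_cu_seqlens([5, 3, 8]))  # tensor([ 0,  5,  8, 16], dtype=torch.int32)
```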
- 10
Streaming & Sharded Datasets: Training on 500GB+ Without Disk
A 1TB dataset fits on 4090's 2TB NVMe, but tokenizing and caching needs 5TB. Solution: streaming. HF datasets.IterableDataset, WebDataset .tar shards, MosaicML Streaming (MDS), S3 streaming, resumable streaming, multi-worker collator pattern.
- 11
Long-Context Dataset Engineering: NIAH, RULER, and Data for 128K Context FT
Actually using Llama 3.1's 128K context: how to produce long-context SFT data? NIAH synthetic, RULER benchmark recipes, long-form QA datasets, code-repo concatenation, repository-level context. Long-context QLoRA (128K seq) on RTX 4090 — 22GB peak with packing.
- 12
DPO / KTO Dataset Engineering: The Engineering of Chosen/Rejected Triplet Generation
DPO and KTO need 'chosen' (good) and 'rejected' (bad) response pairs. Generation methods: AI Feedback Loop (RLAIF), regex-graded pairs (math/code), human-in-the-loop, hard-negative mining, length-controlled pairs. UltraFeedback analysis, TR DPO dataset build, KTO's unpaired advantage.
Part III — Small Open Models (1B–8B)
- 1
Llama 3.1 / 3.2 / 3.3 8B — The Workhorse of RTX 4090: GQA + 128K Context + Turkish Recipe
Anatomy of Llama 3.1/3.2/3.3 8B-Instruct: 32-layer × 4096-hidden, GQA (8 KV-head), RoPE θ=500K, SwiGLU, RMSNorm, 128K context. QLoRA NF4 + Unsloth on RTX 4090 with 50K Turkish Alpaca for 1 epoch ~50 min. TR-MMLU baseline 32.4 → fine-tune 39.8 (+23%). Full recipe.
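The load-time skeleton of that recipe, shown with plain transformers + peft (the lab drives it through Unsloth; r/alpha/target values here are typical placeholders, not the pinned hyperparameters):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",            # QLoRA's NormalFloat4
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", quantization_config=bnb, device_map="auto")
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # adapters: a fraction of a percent of 8B
```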
- 2
Llama 3.2 1B / 3B — Edge & Mobile FT: Tied Embeddings + Distillation + GGUF Q4
Llama 3.2 1B/3B — distilled from Llama 3.1 8B. Tied embeddings, edge inference. Full FT possible on RTX 4090 (1B=2GB, 3B=6GB W). 8-15 tok/s on iPhone/Pixel with GGUF Q4_K_M. TR-MMLU numbers and dataset strategies.
- 3
Qwen 2.5 / Qwen3 1.5B/3B/7B — Multilingual Champion (Turkish Token Efficiency)
Qwen 2.5 / Qwen3 — Alibaba's open-weight family. 151K vocab (TR-friendly), Apache 2.0, easier than Llama for FT. Qwen2.5-7B QLoRA on RTX 4090: 1 epoch ~40 min. TR-MMLU baseline 38.1 → fine-tune 44.2 (+16%). Qwen3 14B + YaRN.
- 4
Qwen3 14B / 32B Base + YaRN: Long-Context FT (32K → 128K) Marginally Feasible on RTX 4090
QLoRA FT of Qwen3 14B with 32K context on RTX 4090 — peak 21 GB, marginal fit. YaRN rope-scaling math, long-context SFT dataset (NIAH + RULER), where 32B is impossible on 4090. Cloud 1×H100 80GB alternative.
- 5
Mistral 7B v0.3 + Mistral Small 3 (24B): Sliding Window Deprecation + Tool-Calling
Mistral 7B v0.3 (Apache 2.0, 32K), Mistral Small 3 (24B, Apache 2.0). v0.3 sliding window deprecation, function-calling, tool-token training. Mistral 7B QLoRA on RTX 4090: ~45 min. Mistral Small 3 marginal fit.
- 6
Gemma 3 1B / 4B / 12B / 27B: Google's 256K Vocab + Multimodal (4B+)
Gemma 3 — Google's 2025 open models. 256K vocab, 4B+ multimodal (SigLIP vision tower), GeGLU, RMSNorm, 128K context, ShieldGemma. Gemma 3 4B/12B QLoRA on RTX 4090. No system role (prepend to user), Gemma 3 ToS attention.
- 7
Phi-4 + Phi-4-mini: Microsoft's Synthetic-Curriculum Model — Why Fragile in TR?
Phi-4 14B + Phi-4-mini 3.8B — Microsoft's 'textbook quality' synthetic-data-trained models. Strong in math + code, weak in general TR. Why? Synthetic data heavily English. Phi-4 QLoRA Lab on RTX 4090 + where it shines (math reasoning, code completion).
- 8
SmolLM3 1.7B: Tiny Tier — Production Model Running on 8GB RAM Devices
SmolLM3 (HuggingFace, Mar 2025) — 1.7B params, hybrid GQA, 256K context (YaRN), 100% open (data, training pipeline, weights). Edge target: 8GB RAM phone / RPi 5 / IoT. Full FT on RTX 4090 in 25 min. Q4_K_M GGUF → 1.0 GB.
- 9
DeepSeek-R1-Distill (Llama-8B / Qwen-7B): Reasoning Trace Distillation — Learning 'Think Tokens'
DeepSeek-R1-Distill — Llama/Qwen bases distilled from R1 (671B) traces. <think>...</think> format, CoT trace dataset, compressing R1's reasoning into 7-8B. Your own reasoning FT on RTX 4090: 1000 R1 traces suffice.
- 10
Yi-1.5 / InternLM2.5 / Aya Expanse: Underdog Comparative TR-MMLU
Llama / Qwen / Gemma are popular but not the only options. Yi-1.5 (01.AI), InternLM2.5 (Shanghai AI Lab), Aya Expanse (Cohere) — which shines in TR? Same recipe comparison on RTX 4090.
- 11
Comparative Lab: Same Recipe + Same Data on 10 Models — Let the Table Decide
Part III capstone: FT 10 models (Llama 3.x, Qwen 2.5/3, Mistral, Gemma 3, Phi-4, SmolLM3, R1-Distill, Aya Expanse) on the same 50K TR Alpaca with same hyperparams. Loss curve overlay, TR-MMLU + MT-Bench table, GPU hours, electricity, quality/cost ratio.
Part IV — Mid-Large Models (13B-70B+) + Distributed Internals
- 1
PyTorch FSDP Anatomy: FULL_SHARD vs SHARD_GRAD_OP vs HYBRID_SHARD + Mixed Precision Policy
FSDP — modern PyTorch's distributed training weapon. 3 sharding strategies, MixedPrecision policy, BackwardPrefetch, auto_wrap_policy. Llama 3.3 70B QLoRA recipe on 8×H100 SXM.
- 2
FSDP2 (fully_shard): Per-Parameter Sharding + DTensor + 2024+ PyTorch Innovation
FSDP2 (PyTorch 2.4+) — evolution of FSDP. Per-parameter sharding (FlatParameter pattern dropped), DTensor backbone, FQN-based resumable checkpointing, easier mixed precision. Llama 3.3 70B + FSDP2 + DCP recipe.
- 3
DeepSpeed ZeRO Stage 1/2/3 + ZeRO-Infinity: NVMe Offload + 70B on Single GPU?
ZeRO (Microsoft) — father of sharding, predates FSDP. Stage 1 (optimizer shard), 2 (+ gradient), 3 (+ param, FULL_SHARD equivalent). ZeRO-Infinity NVMe spillover → 70B single GPU theoretically possible (slow but possible). Decision matrix: ZeRO vs FSDP.
- 4
Tensor Parallelism (Megatron): Column-Parallel + Row-Parallel Linear — Splitting the Matrix
Megatron-LM (NVIDIA) Tensor Parallel: matrix split *within itself* across GPUs. Column-parallel linear (output channels split), row-parallel (input channels), all-reduce/gather pattern. TP=2 vs TP=4 on 8×H100. FSDP+TP = 2D parallelism.
- 5
Pipeline Parallelism: GPipe + 1F1B + Interleaved — Bubble Overhead Math
Pipeline Parallel: model layers distributed across GPUs. Forward+Backward streamed. GPipe (simple + bubble overhead), 1F1B (memory efficient), Interleaved 1F1B (Megatron, halves bubble). 70B + 4-node × 8 GPU scenario.
- 6
Sequence Parallel + Context Parallel: Ulysses + Ring Attention + 1M Context
Breaking long-context FT's physics limit: split sequence/context across GPUs. DeepSpeed-Ulysses (sequence parallel — head-wise), Ring Attention (Berkeley), Megatron Sequence Parallel. Enable 1M context. Technical foundation of Kimi-1.5's (Moonshot) 2M context recipe.
- 7
Llama 3.3 70B QLoRA + FSDP: 8×H100 SXM Recipe (5.6h 1 Epoch)
Full Lab recipe for Llama 3.3 70B-Instruct: 8×H100 SXM cloud (Lambda $24/h), QLoRA NF4 + FSDP FULL_SHARD, bitsandbytes 4-bit, gradient checkpointing, paged AdamW. 50K TR Alpaca 1 epoch in 5.6h. TR-MMLU base 55.4 → 60.8.
- 8
Qwen 2.5 32B / 72B Math + Code Mastery: GSM8K + MATH-500 + HumanEval FT Recipe
Qwen 2.5 32B/72B — math + code baseline beating Llama 70B. Math-heavy dataset mix (GSM8K + MATH + AIME + MetaMathQA), step-by-step CoT format, code execution loop, hyperparameter differences. 4×H100 80GB QLoRA 32B recipe (~3h).
- 9
Command-R / Command-R+ + Granite 3: RAG-Native + Citation FT + Enterprise Tier
Cohere Command-R (35B) / Command-R+ (104B) — RAG-tuned baseline, native citation token training. IBM Granite 3 — Apache 2.0 enterprise tier, governance-focused. RAG-FT dataset format, citation accuracy measurement, tool-calling, Command-R+ QLoRA recipe on 4×H100 80GB.
- 10
Hybrid SSM Models: Falcon-Mamba + Zamba2 — Long Context Without KV-Cache
State Space Model (SSM, Mamba) — alternative architecture to Transformer. No KV-cache, inference O(N) (Transformer O(N²)). Falcon-Mamba 7B, Zamba2 (Mamba + transformer hybrid). FT pattern differs from Transformer: state reset, gradient flow, learning rate sensitivity. RTX 4090 recipe.
- 11
Multi-Node Run + Fault-Tolerant Training: 2 Node × 8 H100 NCCL Cluster
Reality of cluster training: nodes fail, NCCL hangs, checkpoints get corrupted. Cookbook's fault-tolerant recipe: NCCL_TIMEOUT, watchdog, signal handling (SIGUSR1), elastic launcher, graceful preemption resume. Survival kit for 70B model 2-day training.
Part V — MoE Internals & Fine-Tuning
- 1
MoE Mathematics: Top-K Router + Softmax + Noise + Auxiliary Load-Balancing Loss
Router is the heart of MoE. Top-K routing math derivation (Shazeer 2017, Switch Transformer 2021), token-to-expert assignment, expert capacity factor (overflow vs underutilization), load balancing loss, softmax temperature, top-K=2 vs top-K=1. Mixtral 8×7B's actual router config.
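A toy top-K router with the Switch-style auxiliary loss, to make the f·P term concrete (a sketch, not Mixtral's actual implementation; f is computed from the top-1 choice as in Switch):

```python
import torch
import torch.nn.functional as F

def topk_route(x, router_w, k: int = 2):
    """x: (tokens, hidden); router_w: (hidden, n_experts)."""
    logits = x @ router_w
    probs = F.softmax(logits, dim=-1)
    topk_p, topk_idx = probs.topk(k, dim=-1)     # token-to-expert assignment
    n_experts = router_w.shape[1]
    f = F.one_hot(topk_idx[:, 0], n_experts).float().mean(0)  # routed fraction
    P = probs.mean(0)                                         # mean router prob
    aux_loss = n_experts * (f * P).sum()  # pushes both toward uniform 1/N
    return topk_p, topk_idx, aux_loss
```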
- 2
Mixtral 8×7B / 8×22B FT: Router Collapse Problem + Aux Loss Weight Calibration
Most common Mixtral FT bug: **router collapse** — one expert dominates, others dead as training progresses. Capacity overflow, dynamic aux loss adaptation, expert balance metrics, FSDP + MoE compatibility (expert parallelism). Mixtral 8×7B QLoRA recipe on 4×H100 80GB (~4h).
- 3
DeepSeek-V3 / R1 (671B, 37B Active): Shared Expert + Fine-Grained Routing — Where to LoRA?
DeepSeek-V3 (671B params, 37B active) — best open example of modern MoE. Shared expert (common knowledge for every token) + 256 routed experts (fine-grained). DeepSeek-R1 same arch + RL for reasoning. Impossible on RTX 4090; cookbook's cloud recipe 16×H100 NDR IB + ZeRO-Infinity + expert parallelism.
- 4
Qwen3-MoE + Llama-4-MoE Pattern: Generic MoE FT Recipe (8×H100 Baseline)
Qwen3-MoE (30B-A3B, 235B-A22B) and Llama-4-MoE (Behemoth, Maverick, Scout) — 2025's new MoE generation. 'Generic MoE FT pattern' — apply same discipline to any MoE. Common chat template, router-aware LoRA, expert-targeted SFT. 8×H100 baseline recipe.
- 5
Sparse Upcycling: Converting Dense Model to MoE — Qwen2-MoE Technique Reconstruction
Sparse Upcycling (Komatsuzaki et al. 2022) — convert dense pre-trained model to MoE then continue pre-training to specialize. Copy existing FFN N times, add router, continue training. Cheaper than scratch pre-train. Qwen 2.5 7B → 7B-MoE (8 expert) conversion lab on RTX 4090.
- 6
Expert Specialization Probe: Token Routing Statistics + Language/Domain Specialization
MoE's secret: some experts 'specialize' in math, code, Turkish, formal writing. Probe to measure specialization: feed domain-specific test prompts (math, code, TR-chat), quantify which experts activate. TR specialization map of Mixtral 8×7B.
- 7
MoE Quantization & Inference: Expert Offload + Dynamic Routing Under Quant
MoE inference differs from dense: some experts 'cold' (rarely used) → CPU/disk offload. Dynamic routing × quantization interaction (router's quant tolerance), MoE-specific vLLM tuning, Mixtral AWQ + sparse expert loading. Mixtral 8×7B serving on RTX 4090 (~140 tok/s).
Part VI — Vision-Language Multimodal FT
- 1
VLM Architecture Anatomy: Vision Encoder + Projector + LLM Backbone — Detailed Dissection
VLM's 3 main components: Vision encoder (SigLIP-400M, ViT-G/14, EVA-CLIP), Projector (MLP / Q-former / Resampler / Cross-attention), LLM backbone. Token interleave format, image token allocation, position encoding harmony, 2D/M-RoPE patches. Architecture table for each popular VLM family.
- 2
LLaVA-1.5 / 1.6 / OneVision: 2-Stage Training + Projector Pretrain + Instruction Tune
LLaVA's classic 2-stage training recipe: (1) Projector-only pretrain on 558K image-caption pairs, (2) end-to-end instruction tune. Freeze strategy ablation (vision frozen vs unfrozen, LLM frozen vs unfrozen). LLaVA-1.6 Mistral 7B FT on RTX 4090.
- 3
Llama 3.2 Vision 11B / 90B: Cross-Attention Adapter + Multi-Image FT
Llama 3.2 Vision — Meta's cross-attention adapter approach (different from LLaVA's MLP). Vision encoder ViT-H/14 joins LLM via **interleaved cross-attention layers**. Multi-image FT, image+text interleave format. 11B QLoRA marginal on RTX 4090 (~22 GB), 90B cloud only.
- 4
Qwen 2.5-VL: Dynamic Resolution + M-RoPE + Turkish OCR FT (Invoice/Petition)
Qwen 2.5-VL (3B/7B/72B) — modern multimodal champion. **Dynamic resolution** (no 224×224 fixed), **M-RoPE** (temporal + height + width), document understanding, video, multilingual. End-to-end Turkish invoice/petition OCR FT: dataset prep, vision tower freeze, LoRA target, accuracy measurement.
- 5
Pixtral 12B + Pixtral Large: Mistral Multimodal — Resolution-Free + Apache 2.0
Pixtral 12B (Mistral Nemo 12B + 400M ViT) + Pixtral Large (124B) — Mistral's open multimodal. Apache 2.0, resolution-free, EU AI Act-compliance friendly. 7-32 images per context, 128K context. Pixtral 12B QLoRA marginal on RTX 4090 (~22 GB).
- 6
InternVL2.5 + Idefics3 + Phi-4-Multimodal: Comparative Architecture Tour
Less popular but important VLMs: InternVL2.5 (Shanghai AI Lab, 8B-78B), Idefics3 (HuggingFace), Phi-4-Multimodal (Microsoft, 5.4B vision+text). Architecture + FT pattern comparison. Which shines for niche use-cases (medical/document/scientific).
- 7
When to Freeze the Vision Tower? — Probing Lab + Downstream Eval
VLM FT's most debated decision: freeze the vision encoder or not? Frozen → vision capability preserved, training fast, less risk. Unfrozen → +2-5% quality but 3-5x slower training + overfit risk. Ablation: 5 freeze strategies comparison, RTX 4090 + Qwen 2.5-VL 7B.
- 8
Document VLM FT: DocVQA + ChartQA + TableVQA + Turkish Invoice/Petition Dataset
Document AI use-cases: DocVQA, ChartQA, TableVQA. TR-specific dataset generation: synthetic invoice + petition + contract images, structured field extraction. Qwen 2.5-VL 7B baseline → FT → field accuracy 76% → 94%.
- 9
Grounding FT: Bounding-Box Token Format + RefCOCO-Style Task
VLM's 'pointing' capability: 'point to the dog' → [0.32, 0.45, 0.58, 0.71]. Bbox token format: <bbox>x1,y1,x2,y2</bbox> or normalized 0-1000 coordinates. RefCOCO dataset, grounding evaluation (IoU), Qwen 2.5-VL's native grounding support.
- 10
Video LLM FT: LLaVA-NeXT-Video + VideoLLaMA3 + Frame Sampling Strategy
Video LLM — image's temporal extension. LLaVA-NeXT-Video, VideoLLaMA3, Qwen 2.5-VL native video. Frame sampling (uniform vs adaptive), temporal token compression, long-video Q&A (>1h). Video LLM FT on RTX 4090 — practical with short clips (10-30s).
Part VII — Speech & Audio Fine-Tuning
- 1
Whisper Architecture: Log-Mel Spectrogram + Encoder-Decoder + Language Tokens
Whisper (OpenAI 2022) — speech recognition's gold standard. Anatomy: log-mel spectrogram input (80 bins; 128 in large-v3), 4-32 layer encoder + decoder transformer, BPE tokenizer (50K + multilingual + tasks), language tokens, task tokens, timestamp tokens. Model variants: tiny (39M) → large-v3 (1.5B) → turbo (809M).
- 2
Whisper Large-v3 / Turbo TR FT: Common Voice + Bilkent + Mozilla TR + Custom Corpus
Turkish Whisper FT — comfortable on RTX 4090 (large-v3 ~6 GB, turbo ~3 GB). Common Voice TR (180h), Bilkent TR corpus, Mozilla TR. WER (Word Error Rate) measurement, TR-specific tokenize fixes. Baseline WER 12% → FT WER 6% (~2× improvement).
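WER measurement itself is a one-liner with the jiwer library (toy strings; the lab scores full Common Voice TR splits):

```python
import jiwer

refs = ["bugün hava çok güzel", "yarın toplantı saat onda"]
hyps = ["bugün hava cok güzel", "yarın toplantı saat onda"]

# WER = (substitutions + deletions + insertions) / reference word count
print(jiwer.wer(refs, hyps))  # 1 error over 8 words -> 0.125
```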
- 3
Turkish Dialect FT: Karadeniz / Aegean / Eastern Anatolian Pronunciation + Dataset Collection
Standard Turkish baseline Whisper is good but struggles with dialects (Black Sea 'cik' suffix, Eastern Anatolian, Aegean). Dialect speech recording protocol (consent), 50-100h regional corpus, FT + per-dialect WER. Production: customer service, healthcare (village services).
- 4
Streaming ASR: faster-whisper + distil-whisper — Real-Time Latency Budget < 200ms
Whisper is fast offline (batch) but not optimized for streaming. Solution: **faster-whisper** (CTranslate2 + INT8), **distil-whisper** (50% layers reduced student). Latency budget < 200 ms first-token, 70× real-time. Turkish streaming setup on RTX 4090: chunking, VAD, partial hypotheses.
- 5
Audio LLM: Qwen2-Audio + Phi-4-Multimodal Audio Branch — Audio Understanding + Reply
Audio LLM = beyond Whisper. Not just transcribe, but **understands** audio content and replies. Qwen2-Audio (Alibaba, 7B), Phi-4-Multimodal audio branch. Audio-specific tasks: emotion recognition, music understanding, environmental audio Q&A. Qwen2-Audio FT recipe on RTX 4090.
- 6
TTS FT: XTTS-v2 + F5-TTS + Kokoro + Parler-TTS — Turkish Voice Cloning (Consent + KVKK)
Text-to-Speech FT — insufficient TR baselines. XTTS-v2 (Coqui), F5-TTS (zero-shot voice cloning), Kokoro (StyleTTS2-based), Parler-TTS (description-controlled). Personal voice clone with 5-10 min reference audio. 1-3h FT on RTX 4090. **Ethics: consent + KVKK + deepfake risk**.
- 7
Speaker ID + Diarization FT: pyannote.audio + WavLM — Multi-Speaker Separation
Meeting/call center transcripts: 'who's speaking + what'. pyannote.audio (HF), WavLM speaker embeddings, diarization pipeline (VAD → embedding → clustering). Call center case: customer vs operator separation, FT on RTX 4090 + 100h TR call dataset.
Part VIII — Code Models & Repo-Level FT
- 1
FIM (Fill-in-the-Middle) Format: Prefix + Suffix → Middle Token Logic
Spine of code completion: FIM. Classic LLM next-token prediction is insufficient for code — in a real IDE the cursor is in the middle; both prefix and suffix exist. FIM training format. Dataset prep: random split + transform of existing code. Foundation: the Bavarian et al. 2022 paper.
- 2
Qwen2.5-Coder 7B/14B/32B: Repo-Level Context + FIM Native FT
Qwen2.5-Coder family — 2025's strongest open code LLM. FIM native, 128K context, optimized for repo-level. 32B HumanEval 92.7%, SWE-Bench-Lite 31.6%. 7B QLoRA on RTX 4090 in 40 min; 32B on cloud H100 80GB single-GPU.
- 3
DeepSeek-Coder-V2 16B / 236B: MoE Code Model + Multi-File Context
DeepSeek-Coder-V2 (DeepSeek 2024) — MoE arch (16B / 236B), one of the strongest open code LLMs. 338 programming languages, 128K context, multi-file repo understanding. 16B (2.4B active) QLoRA possible on RTX 4090; 236B cloud only.
- 4
StarCoder 2 + CodeLlama: BigCode RAIL License Labyrinth + 600+ Programming Languages
StarCoder 2 (BigCode + ServiceNow + HF, 2024) — 600+ programming languages, BigCode RAIL license. CodeLlama (Meta, 2023) — Llama 2 base, older. License nuances. Cookbook recommendation: Qwen2.5-Coder > DeepSeek-Coder-V2 > StarCoder 2 > CodeLlama.
- 5
Codestral + Codestral Mamba: Mistral's Code Stack — Only the Mamba Is Apache 2.0
Codestral 22B (Mistral 2024, non-commercial) + **Codestral Mamba 7B** (Apache 2.0, Mamba SSM arch). Codestral Mamba — only Apache 2.0 Mistral code model. SSM arch applied to code, long-context advantages.
- 6
Custom Stack FT Lab: Repo-Tuned Model on Mid-Size Repo (~50K LoC)
FT for company internal codebase: 50K LoC Python+TypeScript repo. File hierarchy preservation, internal symbol awareness, test file pairing, commit history mining (good/bad code), 7B model 4-6h FT on RTX 4090.
- 7
Code Eval: HumanEval + MBPP + BigCodeBench + LiveCodeBench + SWE-Bench-Lite
Code LLM standard benchmark suite: HumanEval (164 Python), MBPP (974 Python), BigCodeBench (1140 calls, 139 libs), LiveCodeBench (data-leak resistant), SWE-Bench-Lite (300 real GitHub issues). Pass@1 vs pass@10, code execution sandbox. Running bench on RTX 4090.
- 8
Code-LLM Safety: Secret Leak Memorization Probe + License-Tainted Code Filter
Code LLMs can memorize API keys, passwords, SSH private keys from training data → leak in production. Detection: memorization probe (random snippets from training set → does model continue?), license-tainted code (GPL viral) filtering. BigCode StarCoder leak incident lessons.
Part IX — Turkish-First & Localization Engineering
- 1
TR Corpus Building: mC4-TR + OSCAR-TR + KAPAR + Wikipedia + Common Crawl + Library Scraping
Collecting 100GB+ Turkish corpus: mC4-TR (35GB), OSCAR-TR (45GB), KAPAR (parliamentary transcripts), Wikipedia TR (2GB), Common Crawl filter (50-200GB potential), library scraping (TR State Library, open works). License and KVKK attention. Practical download/tokenize pipeline.
- 2
TR Quality Pipeline: KenLM Perplexity + Slur/PII Filter + Educational-Value
From raw TR corpus to quality FT data: KenLM 5-gram TR perplexity (gibberish/MT artifact filter), TR slur filter, TR PII detection (TC ID, phone, email), educational-value scorer (FineWeb adaptation). Clean 100GB TR corpus in 4h on RTX 4090.
- 3
Tokenizer Extension Lab: Llama-3 → +8K TR Tokens + Embedding Init
Part II Lesson 2.2's TR-specific full Lab. Add 8K most-frequent TR tokens to Llama 3.1 tokenizer, try byte-decomposition + SVD init, measure perplexity delta, downstream SFT after 500M token continual pre-train: tokens/word 3.2 → 2.1.
- 4
Continual Pre-training TR: Catastrophic Forgetting Mitigation + Replay Buffer
Main risk of continual pre-train: forgetting English while learning TR. Replay buffer (10-15% EN per batch), LR warmup, why LR should be 1/10-1/50 of original pre-train. 2B token TR continual PT on Llama 8B feasible in 24h on RTX 4090.
- 5
TR SFT: Quality > Quantity — 5K Curated TR Data > 100K Noisy
Main insight of TR SFT: less but high-quality data beats more but noisy. 5K human-curated TR > 100K MT-translated bad Alpaca. How to mix TR-Alpaca, OASST-TR, Mukayese, custom domain TR data. Curated 5K dataset: 1 epoch in 12 min on RTX 4090.
- 6
TR Models Reverse Engineering: Trendyol-LLM + Cosmos-LLaMA + KanaryaTR
Turkey's open TR LLMs: Trendyol-LLM (Trendyol e-commerce-focused), Cosmos-LLaMA (Cosmos AI Lab), KanaryaTR (Boğaziçi NLP), TURNA, AnatoliaLLM. Reverse-engineering each: model card, training pipeline, base + data + technique. What can you take for yourself.
- 7
TR Embedding FT: BGE-M3, jina-v3, nomic-embed TR Adaptation + MTEB-TR Eval
TR embedding model FT for RAG: BGE-M3 (multilingual, good TR baseline), jina-embeddings-v3, nomic-embed-text. TR-specific query/document pair generation, contrastive learning (InfoNCE), MTEB-TR benchmark. BGE-M3 TR FT 6h on RTX 4090.
- 8
TR Reranker FT: bge-reranker + jina-reranker — Pair Generation Recipe
Second stage of RAG pipeline: reranker. bge-reranker-v2-m3 (TR baseline) + jina-reranker-v2 + custom TR FT. Query-doc relevance score, cross-encoder architecture, hard-negative mining, 50K TR pairs in 4h on RTX 4090.
- 9
TR Agglutination Pitfalls: Suffix Tokenization + İ/I/ı/i Casefold Bug
Turkish is agglutinative — suffixes attached. Tokenizers often err on 'evlerimizdekiler'. İ/I/ı/i casefold bug, apostrophe normalize (TR vs ASCII), UTF-8 NFC vs NFD inconsistency. Cookbook's 'silent killer' bug list for TR engineers.
- 10
TR Benchmarking Suite: TR-MMLU + Mukayese + TruthfulQA-TR + BBQ-TR + Custom
Standard suite for evaluating FT models in TR: TR-MMLU (general knowledge, Boğaziçi), Mukayese (TR NLP tasks), TruthfulQA-TR (hallucination), BBQ-TR (bias). Automated with lm-eval-harness. CI integration, regression alarms.
Part X — Quantization Engineering
- 1
Quantization Mathematics: Symmetric/Asymmetric, Per-Tensor/Per-Channel/Per-Group, QAT vs PTQ
Mathematical foundations of quantization: float→int mapping formula, symmetric vs asymmetric, per-tensor vs per-channel vs per-group granularity, QAT vs PTQ, bit-width choice. Quantization characteristic of every tensor in Llama 8B's 32 layers on RTX 4090.
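The float→int mapping made concrete — symmetric, per-channel, int4 (a sketch of the formula; production quantizers add zero-points, groups, and calibration on top):

```python
import torch

def quantize_symmetric_int4(w: torch.Tensor):
    """scale = max|w| / (2^(b-1) - 1) per output channel; q = round(w / scale)."""
    qmax = 2 ** (4 - 1) - 1                           # 7 for int4
    scale = w.abs().amax(dim=1, keepdim=True) / qmax  # per-channel granularity
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q, scale

w = torch.randn(4096, 4096)
q, s = quantize_symmetric_int4(w)
print((w - q * s).abs().mean())  # mean dequantization error
```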
- 2
GPTQ Algorithm: Optimal Brain Quantization + Hessian Update — Llama 8B in 12 Min on RTX 4090
GPTQ (Frantar et al. 2022) — LLM weight quantization standard. Optimal Brain Quantization theory (descended from LeCun's 1990 Optimal Brain Damage), Hessian inverse update, error compensation, group quantization. Quantize Llama 3.1 8B in 12 min on RTX 4090. WikiText-2 perplexity delta < 2%.
- 3
AWQ Algorithm: Activation-Aware Salient Channel Scaling — Respecting Outliers
AWQ (Lin et al. 2023) — activation-aware alternative to GPTQ. 'Salient channel scaling' technique that protects activation outliers. Quantize Llama 3.1 8B in 8 min on RTX 4090 via autoawq, slightly better WikiText-2 PPL than GPTQ + easier vLLM serving.
- 4
GGUF K-Quants Block Structure: Q2_K → Q8_K + llama-quantize Perplexity Table
GGUF — llama.cpp's native format, common for CPU/edge inference. K-quants block structure (Q2_K → Q8_K), separate struct per bit-width, llama-quantize for conversion, perplexity-vs-size curve. bf16 → Q4_K_M conversion 5 min on RTX 4090, Q4 GGUF 4.6 GB → CPU/Pi/iPhone deploy.
- 5
EXL2 (ExLlamaV2): Variable Bitrate Quantization — Which Layer at Which Bit?
EXL2 — ExLlamaV2's native format. Different bit-width per layer; sensitive layers get more bits. Measure layer sensitivity via calibration, optimal allocation within budget. Fastest LLM inference for single-user on RTX 4090 (1.5-2x vs vLLM at batch=1).
- 6
FP8 Training: H100 Native, Premature on RTX 4090 — Transformer Engine Internals
FP8 = the future of AI compute. H100 native (FP8 Tensor Cores + WGMMA + Transformer Engine). RTX 4090 (Ada) supports FP8 GEMM but the ecosystem is immature — fallbacks are common, training pipelines buggy. Cookbook rule: bf16 training on 4090, FP8 inference (vLLM). FP8 training on H100 detailed in Part XIII.
- 7
Int4 QLoRA NF4 Internals: Double Quantization + Paged Optimizer + Bitsandbytes Source Tour
NF4 (4-bit NormalFloat) — the core of QLoRA. Optimal 4-bit quantization for normally-distributed weights. Double-quantization (also quantize the scale tensor) for additional 0.4 bit/param savings. Paged AdamW (overflow to CPU RAM). Bitsandbytes source-code tour.
- 8
FP8 Inference: vLLM SmoothQuant + TensorRT-LLM — Production-Ready on RTX 4090
FP8 training premature but FP8 inference production-grade in 2026. vLLM native FP8 (Llama 3.1+/Qwen 2.5+ support), TensorRT-LLM SmoothQuant, AWQ-marlin INT4 vs FP8 comparison. Llama 3.1 8B FP8 conversion + serving on RTX 4090 (~120 tok/s vs bf16 95).
- 9
Calibration Dataset Engineering: Domain-Aware Quantization — Ideal Set for Your Domain
GPTQ/AWQ quality heavily depends on calibration data. WikiText-2 default but varies by production use-case. TR calibration in TR production → 30% better TR-MMLU post-quant. Code domain GitHub Python. Math domain GSM8K. Calibration size sweet spot (128-512).
- 10
Round-trip Eval: Pre/Post Quant Table — TR-MMLU + MT-Bench + Niche Benchmark
Part X capstone: Quantize the same model in bf16, AWQ int4, GPTQ int4, EXL2 4.5bpw, GGUF Q4_K_M, FP8 and compare. TR-MMLU, MT-Bench-TR, niche custom benchmark (Turkish call center sample). Decision matrix: which quant for your use-case?
Part XI — Alignment & Preference Optimization
- 1
Classical RLHF: Reward Model + PPO + KL Constraint — Why Industry Abandoned It
RLHF (Christiano et al. 2017, InstructGPT 2022) — foundation of modern alignment. 3 stages: SFT base + reward model train + PPO with KL constraint. Why it largely vanished from industry? PPO instability, value head maintenance burden, DPO's practical superiority. Mini-RLHF demo with TRL on RTX 4090.
- 2
DPO Math: Bradley-Terry → Loss Function Derivation — Why No Reward Model?
DPO (Rafailov et al. 2023) — mathematical equivalent of RLHF, but SINGLE-stage. Bradley-Terry preference model → KL-constrained RL objective → closed-form optimal policy → SFT-like loss. β hyperparameter's effect on the gradient, DPO TRL DPOTrainer Lab on RTX 4090.
- 3
DPO Implementation From Scratch: One Page of Code, No TRL
Without using TRL DPOTrainer, write your own DPO loss: log-probabilities computation, reference model handling, loss formula, gradient backprop. ~80 lines of PyTorch. To understand where you can go wrong.
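The heart of those ~80 lines — the loss itself, given per-response summed log-probs (a sketch straight from the Rafailov et al. 2023 formula; the lesson adds the log-prob gathering and reference-model plumbing around it):

```python
import torch.nn.functional as F

def dpo_loss(pi_chosen_logps, pi_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    """-log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))."""
    chosen_reward = beta * (pi_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (pi_rejected_logps - ref_rejected_logps)
    loss = -F.logsigmoid(chosen_reward - rejected_reward).mean()
    return loss, chosen_reward.detach(), rejected_reward.detach()
```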
- 4
ORPO: Odds Ratio Preference Optimization — Single-Stage SFT+Alignment
ORPO (Hong et al. 2024) — DPO alternative without SFT base requirement. SFT loss + odds-ratio preference loss in one stage. No ref model → memory savings. Reference-free training, λ hyperparameter, ORPO Lab on RTX 4090.
- 5
KTO (Kahneman-Tversky Optimization): Alignment from One-Sided (Unpaired) Feedback
KTO (Ethayarajh et al. 2024) — feedback you actually get in production: 'thumbs up' / 'thumbs down'. Not pairs. Classical DPO can't use this data. KTO fills the gap: utility function from prospect theory (Kahneman-Tversky). Continuous learning loop in production.
- 6
DPO Family: SimPO + IPO + CPO + RPO + APO — Decision Matrix of 5 Variants
DPO family expanded in 2023-2024: SimPO (Meng et al.) — length-normalized, IPO (Azar et al.) — overfit fix, CPO (Xu et al.) — KL ratio fix, RPO (Iterative) — online iterative, APO (anchored). Loss formula for each, when to use which, quick RTX 4090 comparison.
- 7
GRPO (Group Relative Policy Optimization): DeepSeek-R1's Verifiable Reward Recipe
GRPO (DeepSeek 2024) — simplified variant of PPO. No critic/value head. Sample G different responses per batch, normalize **relative rewards** within group. Verifiable rewards (math correctness, code execution) enable reasoning RL. Qwen-7B + GRPO + GSM8K accuracy +5-8% on RTX 4090.
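The critic-free trick in isolation — advantages are rewards normalized within each group of G samples (a sketch; the full GRPO update adds the clipped ratio and KL term):

```python
import torch

def group_relative_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """rewards: (n_prompts, G) — G responses sampled per prompt."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + 1e-6)

# 2 prompts × 4 samples, verifiable 0/1 math-correctness rewards:
r = torch.tensor([[1., 0., 0., 1.], [0., 0., 0., 1.]])
print(group_relative_advantages(r))
```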
- 8
Reward Function Engineering: Verifiable, Math, Code, Format, Length, Diversity
Reward function = definition of success for GRPO/PPO. Math (regex/SymPy), code (exec + test), format (chat template adherence), length (anti-rambling), diversity (n-gram penalty), composability. Cookbook's reward function design guide.
- 9
Process Reward Models (PRM): Step-Level Supervision — PRM800K Dataset
PRM = reward per reasoning step. Instead of outcome-only (final answer), teach quality of each intermediate step. OpenAI PRM800K dataset, Math-Shepherd auto PRM generation, Step-DPO. Foundation for test-time tree search (Best-of-N, MCTS). PRM train + use on RTX 4090.
- 10
Constitutional AI + RLAIF: Open Replication of Anthropic's Recipe
Anthropic Constitutional AI (Bai et al. 2022): AI critiquing and improving its own responses by 'principles'. RLAIF: alignment with AI feedback (LLM judge instead of human). Open replication: principle list, self-critique loop, revised dataset, small-scale CAI Lab on RTX 4090.
- 11
Reward Hacking Diagnostics: Gaming Detection, Length Bias, Sycophancy Probe
Models 'hack' reward functions — gain reward via wrong path. Length bias (long answers = high reward), sycophancy (overly agreeable), format gaming, repetition. Detection: ablation, holdout probe, qualitative review. Lessons from Anthropic's 'reward over-optimization' report.
Part XII — Reasoning Model FT (R1-style)
- 1
Reasoning Architecture: <think> Token + Segregated vs Interleaved CoT Decision Matrix
Reasoning models split: (1) **Segregated** — reasoning in <think>...</think> block (DeepSeek-R1, o-series), then final answer; (2) **Interleaved** — reasoning + answer mixed (classic CoT, GPT-4-1106). Each's advantages, FT challenges, user UX. Token budget management.
- 2
Reasoning Trace Dataset Generation: Teacher Distillation + Self-Bootstrapping
Trace data generation for reasoning SFT: (a) Teacher distillation — DeepSeek-R1 (MIT license!), Gemini-thinking, o3 API; (b) Self-bootstrapping — small model generates traces + verifiable filter keeps correct; (c) Hybrid. Llama 3.1 70B teacher local serve + 10K trace generation on RTX 4090 (~24h).
- 3
SFT on Reasoning Traces: Llama-8B + R1-Distilled Traces (8K → 32K Context)
If reasoning trace dataset ready, SFT technically simple but details matter: add <think> tokens to vocab, embedding init, context length 32K (R1 traces 5-15K tokens), loss masking (do think tokens contribute to loss?), epoch count. Llama 3.1 8B + 1000 R1 traces 1 epoch on RTX 4090 ~50 min.
- 4
GRPO RL Stage: Math + Code Reward — Convergence Numbers (Qwen-7B + GSM8K +5-8%)
Reasoning model's last stage: GRPO with RL. GRPO with math correctness + code execution rewards on top of SFT base. Reward shaping (correctness 1.0, format 0.2, length penalty 0.001), advantage normalization, KL constraint. Qwen 2.5 7B-Instruct + GSM8K on RTX 4090: 6-8h, accuracy +5-8%.
- 5
Long-CoT Stability: Repetition Collapse + Think-Loop Mitigation
Reasoning model's most common bug: **think-loop** — model keeps thinking same thing. Repetition collapse, length explosion (8K → 30K). Mitigation: entropy bonus, repetition penalty during training, max_think_tokens enforcement, reward shaping (length penalty), early-stopping heuristics.
- 6
Reasoning Eval: AIME 2024/2025 + MATH-500 + GPQA-Diamond + LiveCodeBench
Reasoning model standard eval suite: AIME 2024 (30 problems, American Invitational Mathematics Examination), AIME 2025 (new), MATH-500 (500 high-school competition problems), GPQA-Diamond (graduate-level science Q&A), LiveCodeBench (monthly-refreshed). pass@1 vs majority voting (pass@64) difference. Cookbook standard eval pipeline.
Part XIII — Custom Kernels & Performance Surgery
- 1
FlashAttention v2/v3 Internals: Tile + Online Softmax + Hopper WGMMA
FlashAttention's mathematical heart: tile-by-tile attention compute, **online softmax** (incremental running max + sum), backward recomputation strategy. v2 → v3 difference: Hopper WGMMA, async memory, FP8 attention. Head-size constraint, deterministic mode, varlen variant.
- 2
Triton Crash Course: Block Pointer + Autotune + Masks — GPU Kernel in 50 Lines
Triton (OpenAI, 2021) — GPU kernel framework as fast as CUDA, easy as Python. `@triton.jit`, `tl.program_id`, `tl.arange`, block pointer arithmetic, autotune decorator, mask-based load/store, shared memory abstraction. Write vector add → matmul → softmax kernels from scratch on RTX 4090.
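The canonical first kernel, roughly as the lesson builds it (this mirrors Triton's own vector-add tutorial):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements                # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(10_000, device="cuda")
y = torch.randn(10_000, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK=1024)
assert torch.allclose(out, x + y)
```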
- 3
Custom Triton Kernel Lab: Cross-Entropy + Ignore-Index — Unsloth-Style Speedup
PyTorch's native `F.cross_entropy(ignore_index=-100)` is one of LLM training's most-called kernels, and a fused Triton rewrite beats the naïve path by ~30%. Cookbook Lab: fused logits + softmax + CE + grad → single kernel. The pattern Unsloth uses. 8B model FT throughput +15% on RTX 4090.
- 4
Liger Kernel Tour: RMSNorm + SwiGLU + GeGLU + Fused Linear+CE — Source Reading
Liger Kernel (LinkedIn, 2024) — production-grade Triton kernel suite. Fused RMSNorm + dropout, SwiGLU + GeGLU + GeLU, RoPE rotary, fused linear+CE (memory savings), CrossEntropy chunked. Llama 3.1 8B FT throughput +20%, memory -30% on RTX 4090. Source reading: production Triton patterns.
- 5
PagedAttention (vLLM): Block Table + Copy-on-Write + KV-Cache Fragmentation
Deep anatomy of vLLM's killer feature PagedAttention: split KV-cache into 16-token blocks, logical→physical block table, copy-on-write (prefix sharing), 0% fragmentation. CUDA implementation snippets, vLLM source reading. Prefix cache hit-rate 50%+ → throughput +60% on RTX 4090.
- 6
torch.compile + Inductor: Reduce-Overhead + Dynamic Shapes + Recompile Watcher
PyTorch 2.x's flagship feature: torch.compile. Inductor backend (Triton kernel generation), 3 modes (default, reduce-overhead, max-autotune), dynamic shapes (recompile watcher), CUDA graphs, integration into FT training pipeline. Llama 3.1 8B FT throughput +15% on RTX 4090.
- 7
CUDA Graph Capture: Static-Shape Inference Graph + Eliminating Latency Tail
CUDA Graph — technique to eliminate kernel launch overhead. 'Capture' a compute graph once, then 'replay' — each replay 5-10 µs (vs 30-50 µs kernel launch). Critical for inference latency (especially decode fast-path). vLLM uses it. Requires static shapes.
- 8
Speculative Decoding FT: Draft Model + EAGLE-2 + MEDUSA Head Training
FT version of speculative decoding: pair draft model with target, maximize accept rate. EAGLE-2 head training (Li et al. 2024, +94% throughput), MEDUSA multi-head training, training extra heads while target frozen. Llama 8B target + MEDUSA 4-head ~2-3h training on RTX 4090.
Part XIV — Closed-Source API Fine-Tuning
- 1
OpenAI GPT-4o-mini / GPT-4o / GPT-4.1 Fine-Tuning API: JSONL Schema + Cost + Dashboard
Full practice of the OpenAI fine-tuning API: JSONL format (chat messages), validation set, hyperparameter override, upload/monitor/download flow. Cost telemetry: training tokens × per-model rate ($3/M for GPT-4o-mini, $25/M for GPT-4o), fine-tuned inference billed above base. Your 1000 TR examples fine-tune GPT-4o-mini in 30 min.
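The upload → job → monitor flow in the official Python SDK (the model snapshot name is an assumption — check the current dashboard list):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# train.jsonl: one {"messages": [...]} chat sample per line
f = client.files.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(
    training_file=f.id,
    model="gpt-4o-mini-2024-07-18",  # assumed snapshot; names rotate
)
print(client.fine_tuning.jobs.retrieve(job.id).status)
```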
- 2
OpenAI o-series Reinforcement Fine-Tuning (RFT): Grader Function Design
OpenAI announced RFT in late 2024: fine-tune o-series models (o1, o3, o4-mini) with reasoning RL. **Grader function** — function you write that gives numerical score to model output (math correctness, code execution, custom rule). Ideal for verifiable domains. JSON-based grader spec.
- 3
OpenAI GPT-5/5.1 Distillation Pipeline: Stored Completions + FT API Hybrid
OpenAI 'Stored Completions' feature (2024+): after GPT-5/5.1 inference, save completions → free dataset for distill. FT GPT-4o-mini on these completions → small-model-big-knowledge transfer. License matters (only completions you generated with your API key).
- 4
Anthropic Claude FT: AWS Bedrock Custom + Prompt-Caching Alternative
Anthropic doesn't provide direct FT API (not in Anthropic Console). Two workarounds: (1) **AWS Bedrock Custom** for Claude FT, (2) **Prompt caching** + few-shot prompting (no FT). Cookbook decision: for most use-cases prompt-caching + system prompt refinement suffices; for real FT use Bedrock route.
- 5
Google Gemini 1.5/2.0/2.5 Tuning (Vertex AI): TR Data Upload + Evaluation Pipeline
Google Gemini 1.5/2.0/2.5 — FT via Vertex AI. TR data upload (GCS), JSONL format (similar to OpenAI), training job submission, native evaluation pipeline. Gemini Flash 1.5/2.0 cost-effective TR FT alternative.
- 6
AWS Bedrock Customization: Nova / Claude / Llama / Mistral / Titan FT
5 model families via AWS Bedrock FT: Amazon Nova (Lite/Micro/Pro), Anthropic Claude (Bedrock-only route), Meta Llama, Mistral, Amazon Titan. Provisioned throughput cost math, S3 dataset upload, IAM policy. Turkey access (Frankfurt region).
- 7
Mistral La Plateforme Fine-Tuning: Mistral-Large 2 + Multi-Locale
FT on Mistral's own cloud platform La Plateforme: Mistral-7B-Instruct, Mistral-Small 3 24B, Mistral-Large 2 123B. JSONL Mistral-specific chat template, multilingual (EU + TR). EU data residency (GDPR compliant). Mid-range cost.
- 8
Cohere Command Custom Model: RAG-Tuned Foundation
Cohere Command R/R+ — RAG-native baseline. Custom Model FT via Cohere console, JSONL format, native citation token training. Production deploy Cohere endpoint or enterprise self-host.
- 9
Third-Party FT: Together AI + Fireworks + OpenPipe + Predibase + Replicate
5 important third-party FT services: Together AI (Llama/Qwen/Mistral, multi-tenant LoRA), Fireworks AI (low-latency serving + FT), OpenPipe (production logging → auto FT), Predibase (enterprise + Ludwig), Replicate (community). Decision matrix: cost / features / lock-in.
- 10
Closed-FT vs Self-Hosted FT Decision Matrix: TCO + Latency + Data Residency + KVKK
Cookbook's Part XIV summary decision: closed API FT vs self-hosted open FT. 6-dim comparison: TCO (1-yr estimate), latency (P50/P95), data residency (TR/EU/US), KVKK compliance, model freedom (versioning, license, deploy), quality. Typical decisions for 4 use-cases.
Part XV — Serving Engineering
- 1
vLLM Internals: Continuous Batching + PagedAttention + Prefix Cache
vLLM (Kwon et al. 2023) — gold standard of production LLM serving. Continuous batching: requests added/removed dynamically → GPU idle ends. PagedAttention: KV-cache managed in fixed blocks → 0% fragmentation. Prefix cache: common system prompts not recomputed. Llama 3.1 8B serving on RTX 4090 (175 tok/s batch=1, 920 tok/s batch=16).
- 2
LoRA Hot-Swap Lab: Single Base + N Adapters — 50 Customers Served on a Single 4090
vLLM 0.3+'s killer feature: single base + N LoRA adapters, runtime hot-swap. Separate LoRA per customer, all on same 24GB. Llama 3.1 8B base (~5 GB AWQ) + 30+ adapters (~40 MB each) → 50 customers on single 4090. QPS-vs-latency curve.
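The offline-inference shape of the pattern (adapter path and max_loras are illustrative; production serving does the same through the vLLM server's LoRA flags):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct",
          enable_lora=True, max_loras=8)   # adapters resident simultaneously

out = llm.generate(
    ["Merhaba, siparişim nerede?"],
    SamplingParams(max_tokens=128),
    # (name, unique int id, local path) — one adapter per customer
    lora_request=LoRARequest("customer_42", 42, "/adapters/customer_42"),
)
```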
- 3
SGLang RadixAttention: Structured Output + JSON-Mode + Multi-Branch Caching
SGLang (Zheng et al. 2024) — alternative competitor to vLLM. RadixAttention: prefix cache organized in Trie/Radix tree → multi-branch sharing. Constrained decoding (regex, JSON schema), native structured output, optimized for agent workflows. Llama 3.1 8B SGLang serving + JSON-only response on RTX 4090.
- 4
TGI (HuggingFace Text Generation Inference): Production HF Endpoint Internals
TGI — HuggingFace's production inference server, powers hf.co/inference-endpoints. Rust + Python hybrid, prometheus metrics, multi-GPU support. More aggressive batching than vLLM, with FlashAttention-2 hard-wired in. Llama 3.1 8B serving via TGI docker on RTX 4090.
- 5
TensorRT-LLM: NVIDIA Native Engine — INT8 SmoothQuant + FP8 + In-Flight Batching
TensorRT-LLM — NVIDIA's LLM-specific TensorRT engine. CUDA kernels Hopper/Ada native, fastest inference (+15-30% throughput vs vLLM). Engine build process, INT8 SmoothQuant, FP8 quantization, multi-LoRA. Llama 3.1 8B TRT-LLM engine build (1h) + inference on RTX 4090.
- 6
llama.cpp + Ollama: GGUF Serving + Modelfile + System Prompt Versioning
llama.cpp + Ollama — gold standard for CPU/Apple Silicon/edge. GGUF format, Ollama's Modelfile (system prompt + tools versioning), Ollama API, OpenAI-compatible endpoint. Q4_K_M Llama 8B in Ollama on RTX 4090: 95 tok/s.
- 7
MLX-LM Apple Silicon: FT + Serve on M-Series Mac + Distributed MLX
Apple MLX (2023+) — unified memory ML framework for Apple Silicon. MLX-LM for Llama/Qwen/Gemma FT + inference. 70B inference on M3 Max 128GB, 8B FT on M2 Pro 32GB. Cookbook supplement for Mac users.
- 8
Speculative Decoding Production: Draft + Target Pairing + Accept Rate Measurement
Speculative decoding (Leviathan et al. 2023, Chen et al. 2023) — small draft model predicts 4-8 tokens, target model **verifies**. High accept rate → 2-3x throughput. EAGLE-2 (Li et al. 2024), MEDUSA head training. Llama 3.1 8B target + Llama 3.2 1B draft on RTX 4090: 175 → 290 tok/s.
- 9
Disaggregated Serving: Prefill/Decode Separation — Mooncake + DistServe
Latest trend in modern LLM serving (2024-2026): prefill (input encoding) and decode (generation) on different GPUs. Prefill compute-bound, decode memory-bound — separation gives 30-50% throughput gain. Mooncake (Kimi), DistServe (UCB) recipes. Conceptual in RTX 4090 multi-GPU.
- 10
Edge Inference: ONNX + Jetson + MediaTek NPU + Qualcomm AI Engine
Edge LLM inference is real in 2026: NVIDIA Jetson Orin, MediaTek NPU (Pixel), Qualcomm AI Engine (Snapdragon 8 Gen 3+), Apple Neural Engine. ONNX format for cross-platform, edge-specific quantization (INT8/INT4/W4A8 mixed), latency budget < 200 ms first-token. SmolLM3 1.7B + Pixel 8 Pro deploy recipe.
Part XVI — Production Operations
- 1
Model Registry: HuggingFace Hub Private Repo + MLflow + S3 Layout + Versioning
How to manage 50+ FT model versions in production? HuggingFace Hub private repo + MLflow Model Registry + S3 (chunked artifacts) hybrid. Versioning convention (semver + lineage), tags (production/canary/archive), retention policy. Cookbook's model card template (LoRA adapter + base + recipe).
- 2
A/B + Shadow Traffic: Feature Flag + Canary 1%→5%→25% + Automated Rollback
Safe way to put new FT model in production: shadow traffic (old + new in parallel, compare responses), canary deployment (gradual rampup 1%→5%→25%→100%), feature flag (LaunchDarkly / GrowthBook / Unleash), automated rollback (P95 latency or error rate threshold).
- 3
Online Eval: Judge LLM + Win-Rate Dashboard + Regression Alarms
Real-time model quality measurement in production: Judge LLM (GPT-4o-mini / Llama 3.3 70B) scores every Nth response, win-rate v2 vs v1 dashboard, regression alarms. Open eval kits: PromptFoo, DeepEval, RAGAS. Cookbook's eval suite: daily snapshot + weekly aggregate + alarm on regression > 3 points.
- 4
Drift Detection: Output Distribution Shift + Embedding-Cluster Anomaly
Models 'drift' over time in production: input distribution shifts, output style changes. Detection: response length histogram shift, embedding distance baseline → mean cluster drift, thumbs-down rate trend. Cookbook's weekly drift report — alarm + auto-retrain trigger.
- 5
Continual FT Loop: Weekly Retraining + Replay Buffer + Forgetting Mitigation
Model doesn't stay static in production — new data, new feedback, drift mitigation via **weekly retraining** loop. Replay buffer (30% of old training set) for catastrophic forgetting mitigation, weekly model vs current canary A/B, mandatory cert eval suite.
- 6
Memorization & Membership Inference: Training Data Extraction Probe
FT models may have memorized PII, secrets, copyrighted text from training data. Membership Inference Attack (MIA) test: feed random training snippets, does model continue? Detection thresholds. Mandatory pre-deploy check for KVKK + GDPR compliance.
- 7
Cost Observability: Token-Level Cost + FinOps Tagging + Idle GPU Detector
Bring production LLM TCO under control: per-request token cost tracking, customer-level FinOps tagging, idle GPU detector (alarm if vLLM utilization < 50%), cost-per-query trend, alarm thresholds.
- 8
Incident Drill: 'Model X Hallucinated Yesterday' — Root-Cause Matrix
Most-feared sentence in production: 'Model is returning garbage'. Cookbook's systematic root-cause matrix: model version change, base model update, API provider deprecation, dataset poisoning, prompt injection, sampling temp config drift. Incident response playbook, blameless postmortem template.
Part XVII — Turkey Use-Case Labs
- 1
E-commerce Customer Support Bot: Trendyol/Hepsiburada-Style SLA + Entity Extraction
TR e-commerce-specific customer support bot: 50K real tickets (anonymized) + Trendyol-style SLA (P95 < 3s), entity extraction (order number, product, shipping, return), intent classification (40+ intents), tool-calling (order status API). Llama 3.1 8B + Qwen 2.5 7B comparison, vLLM + LoRA hot-swap deploy.
- 2
TR Code Assistant: Repo with Turkish Comments + Continue.dev IDE Integration
Code assistant specific to Turkish dev ecosystem: FT on TR-commented repos (camelCase awareness, TR jargon), Continue.dev VS Code/JetBrains plugin integration, FIM completion + chat. Qwen2.5-Coder 7B + LoRA, self-host on RTX 4090. Internal company codebase + TR comment format.
- 3
Legal Q&A: TCK + TMK + Constitution + Legislation — RAG + FT Hybrid
TR legal LLM's most critical feature: hallucination KPI < 2% target. Constitution, TCK, TMK, Bankruptcy Law + Supreme Court rulings corpus (~5GB). Retrieval-augmented (BGE-M3 TR FT) + LLM (Qwen 2.5 14B QLoRA) hybrid. Citation token training (mandatory article ref in every answer). Integrated into lawyer workflow.
- 4
Medical Triage TR: Symptom → Preliminary Diagnosis + On-Prem Inference + KVKK + Audit-Log
Hardest parts of health LLM: regulatory (KVKK + health data special category), liability (wrong diagnosis = death), audit-log mandatory, on-prem required. Use case: family physician triage assistant — symptom list → possible preliminary diagnosis + specialist referral. Mistral Small 3 24B + on-prem + LoRA.
- 5
BIST Financial Sentiment + Balance Sheet PDF: Multimodal FT (Qwen2.5-VL)
FT for Turkish stock market (BIST): TR financial news sentiment classification (KAP filings + Bloomberg HT + economy media), reading balance sheet PDFs and extracting financial ratios (Qwen2.5-VL doc understanding), trade signal generation. Trade signals below 75% confidence are withheld (pass).
- 6
MEB Curriculum Tutor: High School Math / Physics PRM-Augmented Reasoning
MEB curriculum-compliant tutor: 9-12th grade math + physics, **PRM-augmented reasoning** (step-level correctness), adaptive difficulty, student misconception detection. Qwen 2.5 7B + reasoning SFT + PRM. RTX 4090 inference, web app frontend.
- 7
e-Government Citizen Assistant: Intent Classification + Tool-Calling (80+ Intents)
e-Government portal integrated LLM: 80+ intents (tax, insurance, driver's license, passport, property, etc.), tool-calling e-Government APIs, personal data (TC ID) PII handling. KVKK-compliant logging, audit trail, citizen consent. Llama 3.1 8B + custom SFT, on-prem.
- 8
Call Center Speech-to-Action: Whisper TR FT + LLM Intent + Real-Time Pipeline
End-to-end call center pipeline: Whisper Large-v3-Turbo TR FT (faster-whisper streaming) → real-time transcription → LLM intent classification (Qwen 2.5 7B) → action (CRM ticket open, order status, escalation). pyannote diarization (customer vs agent). P95 latency < 1.5s.
- 9
Banking Internal Copilot: On-Prem + KVKK Audit-Log + Prompt Injection Red-Team
Internal copilot for Turkish banking (customer rep + ops team): on-prem (BDDK + KVKK mandatory), audit log (every query + response 7-year retention), prompt injection red-team (attacker tries to access customer data), Mistral Small 3 24B + air-gapped deploy.
- 10
Municipality / Public Sector Doc-QA: Official Documents + E-Signature PDF Parse + FT
Doc-QA for municipality/public sector: zoning plan, property record, council decision, tender file, etc. official documents. E-signed PDF parse (PAdES + CAdES), table + form extraction, structured field QA. Qwen 2.5-VL doc understanding + LoRA, intent routing for citizen applications.
Part XVIII — Compliance, Governance & Red-Teaming
- 1
EU AI Act Classification: General-Purpose vs High-Risk + Annex IV Technical Documentation
EU AI Act (in force 2024): classifies LLMs into 4 categories — prohibited, high-risk, limited risk, minimal. Which category your FT model falls in = compliance budget. If high-risk: Annex IV, CE marking, conformity assessment. Mandatory if selling to EU market from Turkey.
- 2
KVKK Compliance: Anonymization + Right to Erasure + Machine Unlearning
KVKK Article 7: 'Right to Erasure'. Citizen says 'delete me from dataset': re-train expensive (millions). **Machine Unlearning** alternative: SISA approach or gradient ascent method. KVKK Board decisions, practical example (TR banking citizen erasure request).
- 3
Model License Labyrinth: Llama vs Gemma vs Qwen vs Mistral — 'Derivative Work' Debate
Which license when publishing an FT model? How does the base model's license affect the **derivative work**? Llama 3 Community License (>700M MAU restriction), Gemma ToS (responsible use), Qwen2 Apache 2.0 (most flexible), Mistral Research vs Apache (model-specific), OpenAI ToS (output restriction). Cookbook decision matrix.
- 4
Data License Chain: CC-BY-SA Viral Effect + Common Crawl ToS + GitHub Permissive Filter
How does training dataset's license affect FT model? CC-BY-SA viral (derivative must be same license), Common Crawl ToS (research only), GitHub permissive filter (MIT/Apache/BSD only — no GPL). Can model trained on Wikipedia (CC-BY-SA) be CC-BY-SA? Legal gray area.
- 5
Model Card + Datasheet: HuggingFace Template + Google Datasheet + Bias Section
Mandatory for modern open-source LLM publication: **Model Card** (HF) — model properties, training process, evaluation, intended use, limitations, bias. **Datasheet for Datasets** (Gebru 2021) — training data details. Bias section MANDATORY (EU AI Act requirement). Cookbook's TR template.
- 6
Bias Eval TR: BBQ-TR — Gender / Ethnicity / Sect / Age / Socioeconomic Probe + Mitigation
BBQ (Bias Benchmark for QA, Parrish 2022) TR adaptation: gender, ethnicity (Turkish/Kurdish/Arab/Armenian), sect (Sunni/Alevi), age, socioeconomic status, physical appearance — a 9-category bias probe. 1200 ambiguous question pairs. Cookbook's mitigation recipe: balanced SFT data + DPO bias-rejection examples.
- 7
Red-Teaming Lab: GCG + PAIR + AutoDAN + Prompt Injection Robustness
Mandatory before production deploy: red-team probe. GCG (Greedy Coordinate Gradient — adversarial suffix attack), PAIR (Prompt Automatic Iterative Refinement — LLM attacks LLM), AutoDAN (jailbreak auto-generation), prompt injection (malicious instruction in RAG context). Cookbook's open red-team corpus + scoring method.
- 8
Watermarking & Provenance: C2PA + SynthID + Model Fingerprinting
Making AI-generated content detectable: SynthID (Google, statistical watermark in the token distribution), C2PA (Coalition for Content Provenance and Authenticity — metadata-based), model fingerprinting (training-time backdoor as ownership proof). Mandatory for EU AI Act + emerging regulations.
- 9
DP-SGD (Differential Privacy SGD) + Federated FT: Opacus + Flower
Privacy guarantees for FT with sensitive data: DP-SGD (Opacus library) — add controlled noise to gradients, (ε, δ)-differential privacy guarantee. Federated FT (Flower) — data never reaches server, only gradients. Ideal for KVKK + health + finance. Privacy budget vs accuracy trade-off.
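The Opacus entry point in miniature (`model`, `optimizer`, `train_loader` are assumed to be built as usual; noise_multiplier and max_grad_norm set the privacy/accuracy trade-off):

```python
from opacus import PrivacyEngine

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,  # sigma: more noise -> tighter epsilon, lower accuracy
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)
# ... train as usual, then read the spent budget:
print(privacy_engine.get_epsilon(delta=1e-5))
```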
- 10
ROOTS-Style Data Transparency: Reproducibility + Open Science Standards
ROOTS (BigScience BLOOM) — standard for full transparency of the training corpus. For the cookbook's FT models: dataset card (source, license, processing), data composition table, exclusion criteria. Projects that adopt this standard earn long-term trust in open science.
Capstone — Build Your Own LLM
- 1
Capstone Brief: End-to-End FT Project in Your Niche Domain — 12-Step Roadmap
Cookbook's final project: 4-6 week end-to-end FT project. Pick niche domain (health/legal/ecommerce/public/education/finance/literature/sports/games/history/etc.), collect data, extend tokenizer, continual PT, SFT + DPO, quantize, deploy via vLLM, eval, model card, public release. Practically integrates all 19 Parts.
- 2
Final Run Telemetry Report: MFU + Throughput + Loss + Cost Decomposition
Capstone's final deliverable: detailed telemetry report. MFU%, tokens/s, peak GPU memory, loss curve overlay (SFT + DPO), eval table (TR-MMLU + custom), cost decomposition (cloud hours × $ + electricity ₺ + storage), git_sha + data_sha256 + wandb_run_id triple. Cookbook standard: mandatory for certification.
- 3
Peer Review Rubric: Reproducibility + Eval Rigour + Engineering + TR-Domain Fit
Cookbook's peer review system: capstone projects reviewed by community members. 4 categories × 25 points: Reproducibility (lineage triple, env pinning, open repo), Eval rigour (TR-MMLU + domain bench + bias eval), Engineering quality (MFU >35%, code organization), TR-domain fit (real usage potential). 100 total, 70+ → certification.
- 4
Public Release Package: HF Hub + Model Card + Dataset Card + Eval Results + License Attestation
Releasing capstone model to world: public push to HuggingFace Hub, full model card, dataset card, eval_results.csv, Modelfile (Ollama compat), license attestation (base model + dataset chain), badges ('Apache 2.0', 'BBQ-TR tested', 'KVKK compliant'). Twitter/LinkedIn launch template.
- 5
Certification Path: 'FT Engineer Level III' — Cookbook's Official Recognition
Cookbook's closing certification: deliver at least 85% of lessons across all 19 Parts + capstone peer-review score ≥ 70/100 → **'FT Engineer Level III'** certificate. Certificate added to LinkedIn, recorded at sukruyusufkaya.com/certificates. Turkey's only independent FT engineer certification.