# LLaVA-1.5 / 1.6 / OneVision: 2-Stage Training + Projector Pretrain + Instruction Tune

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-llava-family-2-stage-training
> Updated: 2026-05-14T14:42:53.902Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
> Module: Part VI — Vision-Language Multimodal FT

**TLDR:** LLaVA's classic 2-stage training recipe: (1) projector-only pretraining on 558K image-caption pairs, then (2) end-to-end instruction tuning. Also covered: a freeze-strategy ablation (vision encoder frozen vs. unfrozen, LLM frozen vs. unfrozen) and fine-tuning LLaVA-1.6 Mistral 7B on an RTX 4090.
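The two stages and the freeze ablation can be summarized as a small table of trainable flags. The sketch below is a minimal illustration, not the recipe's actual training code: `FreezePlan`, `freeze_plan`, `train_vision`, and `train_llm` are hypothetical names, and the stage-2 defaults (vision encoder frozen, LLM trainable) are an assumption chosen only to make the ablation knobs explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreezePlan:
    """Trainable flags (True = receives gradients) for each component."""
    vision_encoder: bool
    projector: bool
    llm: bool


def freeze_plan(stage, train_vision=False, train_llm=True):
    """Return the trainable flags for stage 1 or stage 2.

    The stage-2 defaults are an assumption for illustration; train_vision
    and train_llm are the two ablation knobs (frozen vs. unfrozen).
    """
    if stage == 1:
        # Projector-only pretrain: vision encoder and LLM stay frozen.
        return FreezePlan(vision_encoder=False, projector=True, llm=False)
    if stage == 2:
        # Instruction tune: the projector keeps training; the rest are knobs.
        return FreezePlan(vision_encoder=train_vision, projector=True, llm=train_llm)
    raise ValueError(f"stage must be 1 or 2, got {stage!r}")


if __name__ == "__main__":
    for stage in (1, 2):
        print(stage, freeze_plan(stage))
```

In a PyTorch training script, flags like these would typically be applied by setting `requires_grad` on the parameters of each submodule before building the optimizer.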

