# ORPO: Odds Ratio Preference Optimization — Single-Stage SFT+Alignment

> Source: https://sukruyusufkaya.com/en/learn/fine-tuning-cookbook/ftc-orpo-single-stage-sft-alignment
> Updated: 2026-05-14T14:42:58.083Z
> Category: Fine-Tuning Cookbook (Model-by-Model)
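> Module: Part XI — Alignment & Preference Optimization

**TLDR:** ORPO (Hong et al., 2024) is an alternative to DPO that does not require a separate SFT stage on the base model first. It combines the SFT loss and an odds-ratio preference loss in a single training stage. Because training is reference-free, no reference model has to be held in memory, which saves memory compared with DPO. This lesson covers reference-free training, the λ hyperparameter that weights the preference term, and an ORPO lab on an RTX 4090.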
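To make the "SFT loss + odds-ratio preference loss" combination concrete, here is a minimal sketch of the per-example objective for a single (chosen, rejected) pair. It assumes the inputs are length-normalized (average per-token) log-probabilities that the policy model assigns to each response. The function names are hypothetical, and `lam=0.1` is only an illustrative default, not a recommended setting.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    # log(1 - p) is computed with log1p for better precision when p is small.
    return avg_logp - math.log1p(-math.exp(avg_logp))


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1):
    """Sketch of the ORPO loss for one preference pair (hypothetical helper).

    Returns (total, sft_loss, or_loss). No reference model is involved:
    both terms depend only on the policy's own log-probabilities.
    """
    # SFT term: negative log-likelihood of the chosen response.
    sft_loss = -chosen_avg_logp
    # Log of the odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # Preference term: -log(sigmoid(r)), which equals softplus(-r).
    or_loss = softplus(-log_odds_ratio)
    return sft_loss + lam * or_loss, sft_loss, or_loss


if __name__ == "__main__":
    total, sft, pref = orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-1.5)
    print(f"total={total:.4f} sft={sft:.4f} odds_ratio_term={pref:.4f}")
```

When both responses are equally likely, the log odds ratio is 0 and the preference term equals log 2. The term shrinks as the model favors the chosen response and grows when it favors the rejected one. λ controls how strongly that signal is mixed into the SFT objective.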

